






BIOMETRIKA 


A JOURNAL FOR THE STATISTICAL STUDY OF
BIOLOGICAL PROBLEMS


FOUNDED BY 

W. F. R. WELDON, FRANCIS GALTON and KARL PEARSON 


EDITED BY 

EGON S. PEARSON 


IN CONSULTATION WITH 


HARALD CRAMER 
R. C. GEARY 
MAJOR GREENWOOD 


J. B. S. HALDANE 
G. M. MORANT 
JOHN WISHART 


VOLUME XXXIV 
1947 


ISSUED BY THE BIOMETRIKA OFFICE 
UNIVERSITY COLLEGE, LONDON 
AND PRINTED AT THE 
UNIVERSITY PRESS, CAMBRIDGE 



PRINTED IN GREAT BRITAIN



CONTENTS OF VOLUME XXXIV

Memoirs                                                                 PAGES

I. The variance of the overlap of geometrical figures with reference
   to a bombing problem. By F. Garwood. With two Figures in
   the Text . . . 1-17

II. A study of a first dynasty series of Egyptian skulls from Sakkara
   and of an eleventh dynasty series from Thebes. By A. Batrawi
   and G. M. Morant. With one Figure in the Text and one
   Folding Chart . . . 18-27

III. The generalization of 'Student's' problem when several different
   population variances are involved. By B. L. Welch . . . 28-35

IV. The distribution of Kendall's τ coefficient of rank correlation in
   rankings containing ties. By G. P. Sillitto. With one Folding
   Chart . . . 36-40

V. The use of range in place of standard deviation in the t-test. By
   E. Lord . . . 41-67

VI. The frequency distribution of √b₁ for samples of all sizes drawn at
   random from a normal population. By R. C. Geary. With eight
   Figures in the Text . . . 68-97

VII. On the computation of universal moments of tests of statistical
   normality derived from samples drawn at random from a normal
   universe. Application to the calculation of the seventh moment
   of b₂. By R. C. Geary and J. P. G. Worlledge . . . 98-110

VIII. The asymptotical distribution of range in samples from a normal
   population. By G. Elfving. With two Figures in the Text . . . 111-119

IX. Limits of the ratio of mean range to standard deviation. By R. L.
   Plackett . . . 120-122

X. Significance tests for 2 x 2 tables. By G. A. Barnard. With two
   Figures in the Text . . . 123-138

XI. The choice of statistical tests illustrated on the interpretation of
   data classed in a 2 x 2 table. By E. S. Pearson . . . 139-167

XII. 2 x 2 tables. A note on E. S. Pearson's paper. By G. A. Barnard . . . 168-169

XIII. The cumulants of the z and of the logarithmic χ² and t distributions.
   By J. Wishart . . . 170-178

XIV. The meaning of a significance level. By G. A. Barnard . . . 179-182

XV. On the distribution of the rank correlation coefficient τ when the
   variates are not independent. By W. Hoffding . . . 183-196

XVI. The significance of rank correlations where parental correlation
   exists. By H. E. Daniels and M. G. Kendall . . . 197-208

XVII. Testing for normality. By R. C. Geary . . . 209-242

XVIII. The stratified semi-stationary population. By S. Vajda . . . 243-254

XIX. A simple approach to confounding and fractional replication in
   factorial experiments. By O. Kempthorne . . . 255-272

XX. A comparison of stratified with unrestricted random sampling from
   a finite population. By P. Armitage. With four Figures in the
   Text . . . 273-280

XXI. Some theorems on time series. I. By P. A. P. Moran . . . 281-291

XXII. Rank correlation between two variables, one of which is ranked, the
   other dichotomous. By J. W. Whitfield . . . 292-296

XXIII. The variance of τ when both rankings contain ties. By M. G.
   Kendall . . . 297-298

XXIV. A χ² 'smooth' test for goodness of fit. By F. N. David . . . 299-310

XXV. An exact test for the equality of variances. By R. L. Plackett . . . 311-319

XXVI. The estimation from individual records of the relationship between
   dose and quantal response. By D. J. Finney. With four Figures
   in the Text . . . 320-334

XXVII. A power function for tests of randomness in a sequence of
   alternatives. By F. N. David. With one Figure in the Text . . . 335-339

XXVIII. A numerical solution of the problem of moments. By H. O. Hartley
   and S. H. Khamis . . . 340-351

XXIX. Approximation to percentage points of the z-distribution. By A. H.
   Carter . . . 352-358


Miscellanea                                                             PAGES

(i) Note on the cumulants of Fisher's z-distribution. By L. A. Aroian . . . 359-360

(ii) A note on the mean deviation from the median. By K. R. Nair . . . 360-362

(iii) On the method of paired comparisons. By P. A. P. Moran . . . 363-365

(iv) Notes on the calculation of autocorrelations of linear autoregressive
   schemes. By M. H. Quenouille . . . 366-367

(v) Approximate formulae for the percentage points of the incomplete beta
   function and of the χ² distribution. By D. Halton Thomson . . . 368-372

(vi) Review of C. E. Weatherburn's A First Course in Mathematical
   Statistics. By F. N. David . . . 373

(vii) Review of Advances in Genetics, Vol. 1, Academic Press, New York. By
   J. B. S. Haldane . . . 373

(viii) Review of H. Cramer's Mathematical Methods of Statistics. By F. N.
   David . . . 374



Volume XXXIV, Parts I and II                                January 1947

THE VARIANCE OF THE OVERLAP OF GEOMETRICAL FIGURES 
WITH REFERENCE TO A BOMBING PROBLEM 
By F. GARWOOD, Ph.D. 

1. Introduction 

The present paper deals with a particular problem arising in the mathematical study of 
bombing. Briefly, the general problem is that of predicting the over-all effects of a bombing 
attack carried out under given conditions against a given target, and the mathematical 
treatment involves various simplifying assumptions concerning these conditions. 

In the type of problem considered here, attention is centred on the total plan area of
damage caused to a single building by bombs falling independently and at random over a
larger area containing the building. It is assumed that each bomb damages all that part of
the building contained within a circle of fixed size centred at the bomb (a square damage
area is also considered), while the building has a simple plan outline, such as a rectangle or
a circle. The area of damage of two or more adjacent bombs is merely the area covered by
the circles. The theoretical problems dealt with are those of estimating the variance of the
amount of damaged area (the estimation of the mean or expected damage presents little
difficulty). It would be more satisfactory to obtain the complete frequency distributions,
but this has so far not been achieved, nor has it been possible to obtain explicit formulae for
the 3rd and 4th moments.

As there may be applications of the problems to fields completely different from those of 
bombing studies, and as they are problems which involve essentially the concepts of geometry 
and of probability, it is convenient to express them entirely in these terms. 

We thus have problems of the following type. A number of circles are placed at random on
a plane so that each one has some or all of its area inside a fixed square. What are the mean
and variance of the area of the square covered by the circles? The fundamentals of this type
of problem have been studied by Robbins (1944), and Bronowski & Neyman have
dealt with another particular case.* Robbins's results enable us to deal with geometrical
figures other than circles and squares, and also to deal with cases where the number,
position and orientation of the 'covering' figures follow probability laws other than the
simple ones implied in the above example.

2. Robbins's theorem

In leading up to his theorem, Robbins uses the concept of a random measurable subset $X$
of n-dimensional Euclidean space $E_n$. He defines the function $g(x, X)$ for every point $x$
of $E_n$ and for every $X$ as equal to 1 for $x \in X$ and zero elsewhere. This theorem is then as
follows:

Let $X$ be a random Lebesgue measurable subset of $E_n$, with measure $\mu(X)$. For any point
$x$ of $E_n$ let $p(x) = \Pr(x \in X)$. Then, assuming that the function $g(x, X)$ is a measurable function
of the pair $(x, X)$, the expected value of the measure of $X$ will be given by the Lebesgue integral of
the function $p(x)$ over $E_n$.

* Note by Editor. This paper was received for publication in September 1945; Dr Garwood has
asked me to add the following note in proof. "The author had the privilege of seeing the work of
Bronowski & Neyman in proof. This paper was then submitted, after which their work was published
(1945) together with a second article by Robbins (1945), who has solved, among others, some of the
problems dealt with in this paper, as acknowledged in later footnotes."


Robbins generalizes this result to obtain the mth moment of the measure of $X$; this is the
integral of the function $p(x_1, x_2, \ldots, x_m)$ over $E_{mn}$, where

\[ p(x_1, x_2, \ldots, x_m) = \Pr(x_1 \in X \text{ and } x_2 \in X \ldots \text{ and } x_m \in X). \tag{1} \]

It is useful to give a simple non-rigorous proof of this result. Suppose the space $E_n$ to be
divided into an enumerably infinite set of small elements $\omega_1, \omega_2, \ldots$. If we assume that any
particular subset $X$ can be made up of a selection of the $\omega$'s, then

\[ \mu(X) = \lambda_1\omega_1 + \lambda_2\omega_2 + \ldots, \]

where $\omega_i$ is here used also as the measure of the element $\omega_i$, and where the $\lambda$'s, appropriate
to this particular $X$, are 0 or 1. Hence

\[ \{\mu(X)\}^m = \sum_p \sum_q \ldots \lambda_p \lambda_q \ldots \omega_p \omega_q \ldots, \]

where each summation is over the whole of $E_n$, and there are m such summations. Thus,
writing E for expectation,

\[ \mathrm{E}\{\mu(X)\}^m = \sum_p \sum_q \ldots \omega_p \omega_q \ldots \mathrm{E}(\lambda_p \lambda_q \ldots). \]

But the expectation of $\lambda_p \lambda_q \ldots$ is the probability that the elements $\omega_p, \omega_q, \ldots$ are in $X$, and
on proceeding to the limit the desired result is obtained.

The verification of this result in the case, say, of the 2nd moment of a linearly distributed
variate, is instructive. Thus suppose $x$ is a variate with a probability function $F(x)$, i.e. the
probability of obtaining a value $\le x$ is given by the measurable function $F(x)$, where

\[ F(-\infty) = 0 \quad\text{and}\quad F(\infty) = 1. \]

Define $X$ as the interval from 0 to $x$; then the expectation of the square of the measure of
$X$ is the 2nd moment of $x$. To use Robbins's theorem we need
the probability $p(x_1, x_2)$ that a given pair of values $x_1$ and $x_2$
both lie in the interval $0, x$ chosen at random. Using co-ordinate
axes $Ox_1$, $Ox_2$, this probability is zero in the 2nd and
4th quadrants, since $0, x$ cannot contain two points $x_1$ and $x_2$
of opposite signs. In the region A (see Fig. 1), where $x_1 > x_2 > 0$,
the two values $x_1$ and $x_2$ are both in $0, x$ if $x > x_1$, and the
probability of this is $1 - F(x_1)$. Thus in A,

\[ p(x_1, x_2) = 1 - F(x_1). \]

Similarly in B, $p(x_1, x_2) = 1 - F(x_2)$,
while in C, $p(x_1, x_2) = F(x_1)$,
and in D, $p(x_1, x_2) = F(x_2)$.

The integral of $p(x_1, x_2)$ over A is seen to be

\[ \int_0^{\infty} x_1 [1 - F(x_1)]\,dx_1, \]

while the total integral of $p(x_1, x_2)$ over the whole plane is

\[ 2\int_0^{\infty} x[1 - F(x)]\,dx - 2\int_{-\infty}^{0} x F(x)\,dx. \]

A single integration by parts then leads to

\[ \int_{-\infty}^{\infty} x^2\,dF(x), \]

which is the 2nd moment of $x$, as required.
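The verification above can also be checked numerically. The following is a minimal sketch, assuming for illustration that $x$ follows the exponential distribution with unit mean (this choice, the function name `simpson` and the truncation of the range at 50 are assumptions and not part of the paper). For a non-negative variate the regions C and D contribute nothing, so the 1st and 2nd moments reduce to the integrals of $1 - F(x)$ and $2x[1 - F(x)]$.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumed example: x ~ Exponential(1), so F(x) = 1 - exp(-x) for x >= 0.
def F(x):
    return 1.0 - math.exp(-x)

# 1st moment: integral of p(x1) = Pr(x1 lies in (0, x)) = 1 - F(x1).
first_moment = simpson(lambda t: 1.0 - F(t), 0.0, 50.0)

# 2nd moment: twice the integral over region A, i.e. 2 * int x [1 - F(x)] dx.
second_moment = simpson(lambda t: 2.0 * t * (1.0 - F(t)), 0.0, 50.0)

print(first_moment, second_moment)
```

For this distribution the exact moments are 1 and 2, which the two quadratures should reproduce closely.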







3. Application of Robbins's theorem to overlap problems

We shall be concerned with cases in $E_2$ where the subset $X$ is the part of a fixed area $A$ in the
plane which is covered by a number of areas $C$ dropped independently and at random on the
plane. We suppose $A$ to be the interior of a simple closed curve, while each $C$ is the interior
of another curve. The area $C$ has a reference point $Q$ (conveniently called its centre) and a
reference line, and it is assumed that there is a frequency distribution $\phi(x, y, \theta)$ of the
position $(x, y)$ of $Q$ and of the inclination $\theta$ of the reference line to a fixed direction in $E_2$.
$\phi(x, y, \theta)$ can be assumed to be zero outside an area $T$, i.e. the points $Q$ are distributed inside
$T$. (In the applications the angle $\theta$ will be constant and the areas $C$ will be equally likely to
fall anywhere over $T$, so that we can write $\phi(x, y, \theta) = 1/T$.) Another chance variable is $k$,
the number of areas $C$; its distribution can be defined by the series $p_0, p_1, p_2, \ldots, p_k, \ldots$
(which are the probabilities of $0, 1, 2, \ldots, k, \ldots$ $C$'s falling on $T$), or by the probability
generating function $G(u)$, where

\[ G(u) = p_0 + p_1 u + p_2 u^2 + \ldots + p_k u^k + \ldots. \tag{2} \]

Finally, it is more convenient to consider the moments of the area $Y = A - X$, i.e. the area
of $A$ (we can use the symbols $Y$, etc. for either the sets or their areas) not covered by the $C$'s.
Evidently the variances of $X$ and $Y$ are equal.

To obtain the 1st moment of $Y$, we need first the probability $p(x_1, y_1)$ that a point $(x_1, y_1)$
of $A$ will belong to $Y$, i.e. of $(x_1, y_1)$ not being covered by a $C$. Now $(x_1, y_1)$ will not be covered
by a particular $C$ falling at an inclination $\theta$ if the centre $Q(x, y)$ falls outside an area
$C(x_1, y_1, \theta)$ obtained by centring the $C$ at $(x_1, y_1)$ and rotating it through 180°. If the part
of $T$ exterior to this area is called $T - C(x_1, y_1, \theta)$, and if we allow all inclinations, the
probability of this occurring is

\[ q(x_1, y_1) = \int_0^{2\pi} \iint_{T - C(x_1, y_1, \theta)} \phi(x, y, \theta)\,dx\,dy\,d\theta. \tag{3} \]

If $k$ $C$'s are dropped independently, the probability is given by $q^k(x_1, y_1)$, so that the total
probability of $(x_1, y_1)$ belonging to $Y$ is

\[ p(x_1, y_1) = \sum_{k=0}^{\infty} p_k q^k(x_1, y_1) = G\{q(x_1, y_1)\}. \tag{4} \]

The 1st moment of $Y$ is thus, in the case of $k$ $C$'s,

\[ \mu'_1(Y) = \iint_A q^k(x_1, y_1)\,dx_1\,dy_1, \tag{5} \]

and in the general case

\[ \mu'_1(Y) = \iint_A G\{q(x_1, y_1)\}\,dx_1\,dy_1. \tag{6} \]

A 

The 2nd moment is obtained by a similar process; we require the probability $p(x_1, y_1, x_2, y_2)$
that neither of two points $(x_1, y_1)$ and $(x_2, y_2)$ is covered by a $C$. Corresponding to these two
points and an inclination $\theta$ the permissible region in which each centre $Q$ can fall is

\[ T - C(x_1, y_1, \theta) - C(x_2, y_2, \theta) = T - C_1 - C_2. \]

Thus for one $C$ the probability is

\[ q(x_1, y_1, x_2, y_2) = \int_0^{2\pi} \iint_{T - C_1 - C_2} \phi(x, y, \theta)\,dx\,dy\,d\theta, \]

giving in the case of $k$ $C$'s,

\[ p(x_1, y_1, x_2, y_2) = q^k(x_1, y_1, x_2, y_2) \quad\text{and}\quad p(x_1, y_1, x_2, y_2) = G\{q(x_1, y_1, x_2, y_2)\} \tag{7} \]

in the general case. Thus the 2nd moment

\[ \mu'_2(Y) = \iiiint q^k(x_1, y_1, x_2, y_2)\,dx_1\,dy_1\,dx_2\,dy_2 \quad\text{for } k\ C\text{'s} \tag{8} \]

and

\[ \mu'_2(Y) = \iiiint G\{q(x_1, y_1, x_2, y_2)\}\,dx_1\,dy_1\,dx_2\,dy_2 \tag{9} \]

in the general case. In general, for the mth moment, the probability that $(x_1, y_1) \ldots (x_m, y_m)$
are not covered by $k$ $C$'s is $q^k(x_1, y_1, \ldots, x_m, y_m)$, where

\[ q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m) = \int_0^{2\pi} \iint_{T - C_1 - C_2 \ldots - C_m} \phi(x, y, \theta)\,dx\,dy\,d\theta, \tag{10} \]

and $T - C_1 - C_2 \ldots - C_m$ is the area of $T$ outside $C$'s centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$
and rotated through 180°.

Thus the mth moment is equal to

\[ \mu'_m(Y) = \iint_A \ldots \iint_A q^k(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m \]

in the case of $k$ $C$'s, or

\[ \mu'_m(Y) = \iint_A \ldots \iint_A G\{q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)\}\,dx_1\,dy_1\,dx_2\,dy_2 \ldots dx_m\,dy_m \tag{11} \]

in the general case.

4. Uniform distribution of covering areas at constant inclination 

As mentioned above, in the cases with which we shall be dealing, the areas $C$ are equally
likely to fall anywhere over $T$, and the angle $\theta$ is constant. The function $\phi(x, y, \theta)$ can be put
equal to $1/T$ for points of $T$ and zero outside; the variable $\theta$, and integration with respect
to it, may be omitted.

The function $q(x_1, y_1)$ is the fraction of $T$ not covered by a $C$ centred at $(x_1, y_1)$ and rotated
through 180°, and in general $q(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)$ is the fraction of $T$ outside $m$ $C$'s
centred at $(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)$ and rotated through 180°.

Instead of the variate $Y$ we can consider $Y/A$, i.e. the fraction of $A$ not covered by $k$ $C$'s,
and to obtain its mth moment we divide $\mu'_m(Y)$ by $A^m$. Also the quantity $dx_1\,dy_1 \ldots dx_m\,dy_m / A^m$
is the probability of obtaining $m$ centres in the elements of area $dx_1\,dy_1, \ldots, dx_m\,dy_m$ of $A$
if these centres are uniformly distributed over $A$.

We thus obtain the following result from (11): the mth moment of the fraction of A not
covered by k C's with their centres falling at random on T is equal to the kth moment of the
fraction of T not covered by m C's with their centres falling at random on A and rotated through
180°.

In the case $k = m = 1$ we can express this in a slightly different way if we (i) deal with the
area common to the two areas concerned, (ii) regard all orientations as possible and as
equally likely, and (iii) deal with areas rather than fractions. We obtain, in fact, the following
result: the integral of the overlap of C and A, when the centre of C is taken over T and all
orientations are permitted, is equal to the corresponding integral of the overlap of C and T, for
all positions of the centre of C on A and for all orientations.

In the practical cases with which we shall deal, the area $A$ is always 'well inside' $T$, i.e.
every point of $A$ can be reached by a $C$ centred somewhere in $T$. In such cases the formula
for the mean overlap is simple; we have $m = 1$ and the fraction of $T$ not covered by one $C$
is $(T - C)/T$, which is constant for all $(x_1, y_1)$, so that its kth moment is $(T - C)^k / T^k$, i.e.

\[ \mu'_1(Y/A) = \left(\frac{T - C}{T}\right)^k. \tag{12} \]

If the number of $C$'s follows a probability generating function $G(u)$, the mean is given by

\[ \mu'_1(Y/A) = G\left(\frac{T - C}{T}\right). \tag{13} \]

For the 2nd moment we are concerned with two $C$'s centred at $(x_1, y_1)$ and $(x_2, y_2)$, and if
their common area is $\Omega(x_1, y_1, x_2, y_2)$, we have

\[ q(x_1, y_1, x_2, y_2) = \frac{T - 2C + \Omega(x_1, y_1, x_2, y_2)}{T}. \tag{14} \]

The 2nd moment $\mu'_2(Y/A)$ is then the expectation of $q^k$ or $G(q)$ for all pairs of points over $A$,
and we no longer have a simple formula as in the case of the 1st moment. The overlap $\Omega$,
however, depends on the relative positions of the two $C$'s, and therefore the number of
variables in the integration is reduced from 4 to 2 or 1. This is illustrated in the following
examples.

5. Circles falling on a fixed square 

Assume $A$ to be a square of unit side (i.e. $A = 1$), $C$ a circle radius $a$ and $T$ a 'square with
rounded corners', whose boundary is at a distance $a$ outside the sides of $A$. Thus

\[ T = 1 + 4a + \pi a^2 \tag{15} \]

and

\[ C = \pi a^2. \tag{16} \]

It is seen that the fraction $q$ of $T$ outside the two circles centres $(x_1, y_1)$ and $(x_2, y_2)$ is a function
only of the distance $r$ between these points, and can therefore be written as $q(r)$. Hence
if $\phi(r)$ is the frequency function of $r$, we obtain

\[ \mu'_2(Y) = \int_0^{\sqrt{2}} q^k(r)\,\phi(r)\,dr \tag{17} \]

or

\[ \mu'_2(Y) = \int_0^{\sqrt{2}} G\{q(r)\}\,\phi(r)\,dr. \tag{18} \]

The area $\Omega(r)$ common to two circles radii $a$ with centres distant $r$ apart is

\[ \Omega(r) = 2a^2(\theta - \sin\theta\cos\theta), \tag{19} \]

where

\[ r = 2a\cos\theta \quad (r \le 2a), \tag{20} \]

and

\[ \Omega(r) = 0 \quad (r \ge 2a), \tag{21} \]

and

\[ q(r) = \frac{1 + 4a - \pi a^2 + \Omega(r)}{1 + 4a + \pi a^2}, \tag{22} \]

where $\Omega(r)$ is given by (19), (20) and (21).
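Formulae (19)-(22) translate directly into code. The sketch below is illustrative only; the function names `omega` and `q_square` are assumptions introduced here, not notation from the paper.

```python
import math

def omega(r, a):
    """Area common to two circles of radius a with centres r apart, eqs (19)-(21)."""
    if r >= 2 * a:
        return 0.0
    theta = math.acos(r / (2 * a))  # from r = 2a cos(theta), eq. (20)
    return 2 * a * a * (theta - math.sin(theta) * math.cos(theta))

def q_square(r, a):
    """Fraction of T (unit square with rounded corners) outside both circles, eq. (22)."""
    T = 1 + 4 * a + math.pi * a * a  # eq. (15)
    return (1 + 4 * a - math.pi * a * a + omega(r, a)) / T
```

Two limiting cases give a quick check: coincident centres ($r = 0$) give $\Omega = \pi a^2 = C$, so that $q = (T - C)/T$, while for $r \ge 2a$ the circles are disjoint and $q = (T - 2C)/T$.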

To obtain the frequency function $\phi(r)$ of $r$, we note that

\[ r^2 = (x_1 - x_2)^2 + (y_1 - y_2)^2, \tag{23} \]

where $x_1$, $x_2$, $y_1$ and $y_2$ are uniformly and independently distributed in the range 0, 1. The
difference $\xi = |x_1 - x_2|$ follows the 'triangular' distribution

\[ df = 2(1 - \xi)\,d\xi, \tag{24} \]

from $\xi = 0$ to 1, so that the quantity

\[ u = \xi^2 = (x_1 - x_2)^2 \]

follows the distribution

\[ df = \frac{1 - \sqrt{u}}{\sqrt{u}}\,du. \tag{25} \]

Similarly, $v = (y_1 - y_2)^2$ follows independently the same distribution

\[ df = \frac{1 - \sqrt{v}}{\sqrt{v}}\,dv. \tag{26} \]

The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product

\[ \frac{(1 - \sqrt{u})(1 - \sqrt{v})}{\sqrt{(uv)}} \]

of the frequencies of $u$ and $v$ over that part of the line $u + v = r^2$ within the square of unit
side in which the point $u, v$ can lie, and we obtain without much difficulty for the frequency
function of $r$,

\[ \phi(r) = 2r(\pi - 4r + r^2) \quad\text{for } 0 < r < 1, \tag{28} \]

and

\[ \phi(r) = 2r\{4\sin^{-1}(1/r) + 4\sqrt{(r^2 - 1)} - r^2 - \pi - 2\} \quad\text{for } 1 < r < \sqrt{2}. \tag{29} \]

Thus the 2nd moment of the fraction of the unit square uncovered is given by the integral
(17) or (18), where $q(r)$ is given by (22), (19), (20) and (21) and $\phi(r)$ is given by (28) and (29).

It does not appear possible to reduce the integral (17) simply to elementary functions, and
quadrature must be used. The integrand has discontinuities in its first derivative at $r = 1$
and $r = 2a$, so that the integration must be carried out separately over the intervals with
these as end-points.
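As a check on (28) and (29), the frequency function can be integrated numerically over the two intervals separately, as the text recommends. This is a minimal sketch; the names `phi_square` and `simpson` and the use of Simpson's rule are assumptions made for illustration, and no claim is made that this was the author's method of quadrature.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def phi_square(r):
    """Frequency function of the distance between two random points in the unit square."""
    if r <= 0 or r >= math.sqrt(2):
        return 0.0
    if r <= 1:
        return 2 * r * (math.pi - 4 * r + r * r)                      # eq. (28)
    return 2 * r * (4 * math.asin(1 / r) + 4 * math.sqrt(r * r - 1)
                    - r * r - math.pi - 2)                            # eq. (29)

# Integrate separately over (0, 1) and (1, sqrt 2), where the derivative breaks.
pieces = [(0.0, 1.0), (1.0, math.sqrt(2))]
total_prob = sum(simpson(phi_square, lo, hi) for lo, hi in pieces)
mean_r2 = sum(simpson(lambda r: r * r * phi_square(r), lo, hi) for lo, hi in pieces)
print(total_prob, mean_r2)
```

The total probability should come out as 1, and the mean of $r^2$ as 1/3, since each squared co-ordinate difference has expectation 1/6.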


6. Overlap of circles on fixed rectangle* 


We replace the square $A$ of the previous section by a rectangle $A$; for convenience we assume
its sides to be $\sqrt{b}$ and $1/\sqrt{b}$, where $b > 1$, so that the ratio of the longer to the shorter side is
$b$ and the area is unity. The centres of the circles radii $a$ are assumed to be equally likely to
fall anywhere in a 'rectangle with rounded corners' $T$, whose boundary is at a distance $a$
outside $A$, i.e.

\[ T = 1 + \pi a^2 + 2a(\sqrt{b} + 1/\sqrt{b}). \tag{30} \]

To obtain the 2nd moment of the fraction of the area of $A$ not covered, we calculate an
integral similar to (17) or (18). The function

\[ q(r) = \frac{T - 2\pi a^2 + \Omega(r)}{T} \]

is derived from $\Omega(r)$, which remains the same, but the frequency distribution $\phi(r)$ of $r$, the
distance between a pair of points chosen at random in the rectangle, is different.

The co-ordinates $x_1$ and $x_2$ are uniformly and independently distributed in the range 0,
$\sqrt{b}$ (if we take $Ox$ parallel to the longer side). The distribution of $u = (x_1 - x_2)^2$ is thus seen
from (26) to be

\[ df = \frac{1 - \sqrt{(u/b)}}{\sqrt{(bu)}}\,du, \tag{31} \]

while the distribution of $v = (y_1 - y_2)^2$ is

\[ df = \frac{1 - \sqrt{(bv)}}{\sqrt{(v/b)}}\,dv. \tag{32} \]

* This problem was solved by Robbins (1945); see footnote on p. 1. 





The distribution of $r = \sqrt{(u + v)}$ is obtained by integrating the product

\[ \frac{(\sqrt{b} - \sqrt{u})(1 - \sqrt{(bv)})}{\sqrt{(buv)}} \]

over that part of the line $u + v = r^2$ within the rectangle $0 < u < b$, $0 < v < 1/b$. This gives

\[ \phi(r) = \phi_1(r) = 2r[\pi - 2r(\sqrt{b} + \sqrt{(1/b)}) + r^2] \quad\text{for } r < 1/\sqrt{b}, \tag{33} \]

\[ \phi(r) = \phi_2(r) = 2r[2\alpha - 1/b - 2r\sqrt{b}(1 - \cos\alpha)] \quad\text{for } 1/\sqrt{b} < r < \sqrt{b}, \tag{34} \]

where $\alpha = \sin^{-1}(1/r\sqrt{b})$, and

\[ \phi(r) = \phi_3(r) = 2r[2(\alpha - \beta) - b - 1/b + 2r\sin\beta/\sqrt{b} + 2r\sqrt{b}\cos\alpha - r^2] \quad\text{for } \sqrt{b} < r < \sqrt{(b + 1/b)}, \tag{35} \]

where $\beta = \cos^{-1}(\sqrt{b}/r)$.

Thus the 2nd moment of $Y$ can be found from (19)-(21) together with (30) and (33)-(35).
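The three branches (33)-(35) can be checked in the same way as the square case. The sketch below assumes the illustrative value $b = 2$ (the 2 x 1 rectangle of unit area); the names `phi_rect` and `simpson` are introduced here for illustration only.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def phi_rect(r, b):
    """Distance density for two random points in a rectangle sqrt(b) by 1/sqrt(b)."""
    sb = math.sqrt(b)
    if r <= 0 or r >= math.sqrt(b + 1 / b):
        return 0.0
    if r <= 1 / sb:                                              # eq. (33)
        return 2 * r * (math.pi - 2 * r * (sb + 1 / sb) + r * r)
    alpha = math.asin(min(1.0, 1 / (r * sb)))
    if r <= sb:                                                  # eq. (34)
        return 2 * r * (2 * alpha - 1 / b - 2 * r * sb * (1 - math.cos(alpha)))
    beta = math.acos(min(1.0, sb / r))                           # eq. (35)
    return 2 * r * (2 * (alpha - beta) - b - 1 / b + 2 * r * math.sin(beta) / sb
                    + 2 * r * sb * math.cos(alpha) - r * r)

b = 2.0  # assumed example: rectangle 2 x 1 scaled to unit area
cuts = [0.0, 1 / math.sqrt(b), math.sqrt(b), math.sqrt(b + 1 / b)]
total_prob = sum(simpson(lambda r: phi_rect(r, b), lo, hi)
                 for lo, hi in zip(cuts, cuts[1:]))
mean_r2 = sum(simpson(lambda r: r * r * phi_rect(r, b), lo, hi)
              for lo, hi in zip(cuts, cuts[1:]))
print(total_prob, mean_r2)
```

The total probability should be 1 and the mean of $r^2$ should equal $(b + 1/b)/6$, the sum of the expectations of the two squared co-ordinate differences.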


7. Overlap of rectangles on a fixed rectangle 

Assume that the fixed rectangle $A$ has sides $a$ and $b$ and the covering rectangles $C$ have sides
$\alpha$ and $\beta$. The latter are assumed to be dropped with sides $\alpha$ parallel to the side $a$ and with
their centres anywhere inside the rectangle $T$, which is concentric with $A$ and has sides
$a + \alpha$ and $b + \beta$. To calculate the 2nd moment of the fraction $Y/A$ of $A$ not covered by $k$ $C$'s,
we use (18) and calculate the expectation of $q^k(x_1, y_1, x_2, y_2)$, where $q$ is the fraction of $T$ not covered
by two $C$'s with their centres $(x_1, y_1)$ and $(x_2, y_2)$ falling at random in $A$. The area common to
two $C$'s is readily seen to depend only on the difference $\xi$ of the $x$ co-ordinates of their
centres and on the similar difference $\eta$ of their $y$ co-ordinates. In fact, the area can be
written as

\[ \Omega(x_1, y_1, x_2, y_2) = [\alpha - \xi][\beta - \eta], \tag{36} \]

where the symbol* $[x]$ stands for $x$ when $x > 0$ and is zero when $x < 0$, and we obtain

\[ q(x_1, y_1, x_2, y_2) = 1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}. \tag{37} \]

To obtain the expectation of $q^k$, we need the frequency distribution of $\xi$ and $\eta$. As in §5,
$\xi$ is readily seen to follow the frequency distribution

\[ df = \frac{2(a - \xi)}{a^2}\,d\xi \tag{38} \]

between 0 and $a$, with a similar distribution for $\eta$, and we obtain the result

\[ \mu'_2(Y/A) = \frac{4}{a^2 b^2} \int_0^a \int_0^b \left(1 + \frac{[\alpha - \xi][\beta - \eta] - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k (a - \xi)(b - \eta)\,d\xi\,d\eta. \tag{39} \]

If the kth power be expanded, the resulting integrals are, with the exception of the first, the
product of integrals whose upper limits are $a' = \min(a, \alpha)$ and $b' = \min(b, \beta)$ respectively.
We obtain

\[ \mu'_2(Y/A) = \frac{4}{a^2 b^2} \int_0^{a'} \int_0^{b'} \left(1 + \frac{(\alpha - \xi)(\beta - \eta) - 2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k (a - \xi)(b - \eta)\,d\xi\,d\eta \]
\[ \qquad + \frac{4}{a^2 b^2} \left(1 - \frac{2\alpha\beta}{(a + \alpha)(b + \beta)}\right)^k \left\{ \int_0^a \int_0^b (a - \xi)(b - \eta)\,d\xi\,d\eta - \int_0^{a'} \int_0^{b'} (a - \xi)(b - \eta)\,d\xi\,d\eta \right\}. \tag{40} \]

* The writer is indebted to Neyman & Bronowski for this convenient notation (see below).

By a simple change of variable we obtain

[equation (41) is illegible in the source]

which is the result obtained by Bronowski & Neyman by a rather different method.*
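The expectation of $q^k$ over the distributions (38) of $\xi$ and $\eta$ can be evaluated by direct numerical integration. The following is a minimal sketch using a midpoint rule; the function names, the grid size and the example dimensions are assumptions chosen for illustration and do not reproduce the closed form (41).

```python
def pos(x):
    """The bracket symbol [x]: x when positive, zero otherwise."""
    return x if x > 0 else 0.0

def second_moment_rect(a, b, alpha, beta, k, n=200):
    """Expectation of q^k for rectangles alpha x beta dropped on a rectangle a x b."""
    T = (a + alpha) * (b + beta)
    hx, hy = a / n, b / n
    total = 0.0
    for i in range(n):
        xi = (i + 0.5) * hx
        for j in range(n):
            eta = (j + 0.5) * hy
            q = 1 + (pos(alpha - xi) * pos(beta - eta) - 2 * alpha * beta) / T  # eq. (37)
            total += q ** k * (a - xi) * (b - eta)
    return 4 * total * hx * hy / (a * a * b * b)

# Assumed example: unit square, covering squares of side 0.5, k = 5 of them.
a, b, alpha, beta, k = 1.0, 1.0, 0.5, 0.5, 5
T = (a + alpha) * (b + beta)
mean = ((T - alpha * beta) / T) ** k          # eq. (12)
variance = second_moment_rect(a, b, alpha, beta, k) - mean ** 2
print(mean, variance)
```

With $k = 0$ the routine returns 1, and for any $k$ the resulting variance should be non-negative, which gives a simple consistency check on the set-up.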

8. Overlap of circles on a fixed circle 

We now consider a fixed circle $A$ of unit area and therefore of radius $b = 1/\sqrt{\pi}$, with $k$ circles
$C$ of radius $a$ dropped at random with their centres uniformly distributed over a circle $T$
of radius $a + b$. The 2nd moment of the fraction $Y/A$ of $A$ not covered is, as in §5, the
expectation of $q^k(r)$, where $q(r)$ is the fraction of $T$ not covered by two circles with centres
falling at random in $A$ a distance $r$ apart. We have

\[ q(r) = \frac{1 + 2a\sqrt{\pi} - \pi a^2 + \Omega(r)}{1 + 2a\sqrt{\pi} + \pi a^2}, \tag{42} \]

where $\Omega(r)$, the overlap of two circles radius $a$ with centres $r$ apart, is given by (19), (20)
and (21) as before. We thus need the frequency distribution $\phi(r)$ of the distance between two
points chosen at random in the circle of unit area to obtain

\[ \mu'_2(Y/A) = \int_0^{2/\sqrt{\pi}} q^k(r)\,\phi(r)\,dr. \tag{43} \]

To do this we use a fairly straightforward geometrical method, finding first the probability
integral

\[ F(r) = \int_0^r \phi(r)\,dr, \tag{44} \]

which is the probability that the distance between the two random points is less than $r$.

The probability that the first point is between $v$ and $v + dv$ from the centre is $2v\,dv/b^2$,
while if

\[ A(v) = \text{area common to circles radii } b \text{ and } r \text{ with centres distance } v \text{ apart,} \tag{45} \]

it follows that the probability of the second point being within $r$ of the first is $A(v)/\pi b^2$. Hence

\[ F(r) = \int_0^b \frac{2v}{b^2} \cdot \frac{A(v)}{\pi b^2}\,dv. \tag{46} \]

Construct the triangle with sides $r$, $b$ and $v$, and let the angles opposite to these be $\theta$, $\phi$
and $\psi$. Then the following can be readily verified:

If $r < b$,

\[ A(v) = b^2\theta + r^2\phi - br\sin\psi \quad\text{if } b - r < v < b, \qquad A(v) = \pi r^2 \quad\text{if } 0 < v < b - r. \tag{47} \]

If $b < r < 2b$,

\[ A(v) = b^2\theta + r^2\phi - br\sin\psi \quad\text{if } r - b < v < b, \qquad A(v) = \pi b^2 \quad\text{if } 0 < v < r - b. \tag{48} \]

The integration in (46) is carried out by parts, with $\psi$ as the ultimate variable of integration,
and to do this we obtain the result

[equation (49) is illegible in the source]

Putting

\[ r = 2b\sin\tfrac{1}{2}\alpha, \tag{50} \]

we obtain, over the whole range of $r$,

\[ F(r) = \frac{\alpha}{\pi} + r^2(\pi - \alpha) - \frac{r\cos\tfrac{1}{2}\alpha}{\sqrt{\pi}} - \frac{r^2\sin\alpha}{2}, \tag{51} \]

* And by Robbins (1946). 





while differentiation yields the frequency distribution as

\[ \phi(r) = 2r(\pi - \alpha) - r^2\sqrt{\{\pi(4 - \pi r^2)\}}, \tag{52} \]

the second term being equal to $2r\sin\alpha$.

It will be noted as a matter of interest that the chance of the two random points falling
further apart than the radius of the circle is $1 - F(b) = 3\sqrt{3}/4\pi$, or 9/22 nearly.*

We thus obtain the 2nd moment of the uncovered area from (42), (43) and (52). 
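The probability integral (51) and the frequency function (52) can be checked against each other numerically. This is an illustrative sketch only; the names `F_disk` and `phi_disk` are assumptions introduced here, with $b = 1/\sqrt{\pi}$ and $\alpha$ defined by (50).

```python
import math

B = 1 / math.sqrt(math.pi)  # radius of the circle of unit area

def alpha_of(r):
    """Angle alpha defined by r = 2b sin(alpha/2), eq. (50)."""
    return 2 * math.asin(min(1.0, r / (2 * B)))

def F_disk(r):
    """Probability that two random points in the unit-area circle are less than r apart."""
    al = alpha_of(r)
    return (al / math.pi + r * r * (math.pi - al)
            - r * math.cos(al / 2) / math.sqrt(math.pi) - r * r * math.sin(al) / 2)

def phi_disk(r):
    """Frequency function of the distance, in the form 2r(pi - alpha - sin alpha)."""
    if r <= 0 or r >= 2 * B:
        return 0.0
    al = alpha_of(r)
    return 2 * r * (math.pi - al - math.sin(al))

# Chance that the points are further apart than the radius of the circle.
chance = 1 - F_disk(B)
print(chance, 3 * math.sqrt(3) / (4 * math.pi), 9 / 22)
```

The chance printed first should agree with $3\sqrt{3}/4\pi$ and lie close to 9/22; a central difference of `F_disk` should also reproduce `phi_disk` at any interior point.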

9. Use of probability generating functions 
(i) Binomial 

It is interesting to apply first the binomial distribution of $k$, the number of $C$'s dropped
uniformly and at random on $T$. Assume that $T$ contains the centres of all the $C$'s which
touch or cover $A$, and that $S$ is some larger area including $T$. If $l$ $C$'s are dropped at random
on $S$, the probability generating function of the number of centres falling on $T$ is

\[ G_1(u) = \left(\frac{S - T + Tu}{S}\right)^l. \tag{53} \]

Thus, from the general result of §4, the mth moment of the fraction of $A$ uncovered is the
expectation of

\[ \left(\frac{S - T + Tq}{S}\right)^l, \]

where $q$ is the fraction of $T$ uncovered by $m$ $C$'s falling on $A$.
But the expression within brackets is the fraction of $S$ uncovered. The use of the binomial
generating function is thus verified.

(ii) Poisson 

The Poisson distribution next suggests itself. If the number of centres follows this
distribution with a mean of $\lambda$ per unit area, the probability generating function is

\[ G_2(u) = e^{\lambda T(u - 1)} \tag{54} \]

and the mth moment will be the expectation over $A$ of

\[ e^{\lambda T(q - 1)}. \]

Alternatively, writing E for expectation, we could write this as

\[ \mu'_m(Y/A) = \mathrm{E}(e^{-\lambda Z}), \tag{55} \]

where $Z$ = area of overlap of $m$ $C$'s falling on $A$. In particular,

\[ \text{mean value of } Y/A = \mu'_1(Y/A) = e^{-\lambda C}. \tag{56} \]

Thus the mth moment of $Y/A$ is related to the characteristic function of $Z$, but this result
does not appear to be of any theoretical importance: it does not, for instance, throw any light
on the frequency distribution of $Y/A$. Formula (55) does, however, demonstrate the fact,
which is otherwise obvious, that the area $T$ does not enter into the frequency distribution of
the fraction of $A$ not covered by $C$'s whose fall follows the Poisson distribution.

As far as the calculation of the variance is concerned, we need to calculate first the 2nd
moment, i.e. the expectation of $e^{-\lambda Z}$ for $m = 2$. In the cases where the falling areas are
circles, the area of overlap $Z$ of two circles is equal to $2C - \Omega(r)$, where $\Omega(r)$ is the function
given above ((19) etc.) for the area common to two $C$'s with centres $r$ apart. The 2nd moment is thus

\[ \mu'_2(Y/A) = e^{-2\lambda C} \int e^{\lambda\Omega(r)}\,\phi(r)\,dr, \tag{57} \]

where $\phi(r)$ is the frequency function of $r$, the formulae for which are given above for the
various cases.
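Formula (57) can be evaluated by quadrature for circles falling on the unit square, using (19)-(21) for $\Omega(r)$ and (28)-(29) for $\phi(r)$. The sketch below is a hedged illustration: the function names and the choice of Simpson's rule are assumptions, and the example values $C = 0.2$ and mean fraction $0.5$ are taken as one of the combinations of Table 1.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def omega(r, a):
    """Area common to two circles radius a, centres r apart, eqs (19)-(21)."""
    if r >= 2 * a:
        return 0.0
    th = math.acos(r / (2 * a))
    return 2 * a * a * (th - math.sin(th) * math.cos(th))

def phi_square(r):
    """Distance density for two random points in the unit square, eqs (28)-(29)."""
    if r <= 0 or r >= math.sqrt(2):
        return 0.0
    if r <= 1:
        return 2 * r * (math.pi - 4 * r + r * r)
    return 2 * r * (4 * math.asin(1 / r) + 4 * math.sqrt(r * r - 1)
                    - r * r - math.pi - 2)

def poisson_variance_circles_on_square(C, m):
    """Variance of Y/A for circles of area C, Poisson fall, mean uncovered fraction m."""
    a = math.sqrt(C / math.pi)
    lam = -math.log(m) / C                    # from m = exp(-lam * C), eq. (56)
    g = lambda r: math.exp(lam * omega(r, a)) * phi_square(r)
    cuts = sorted({0.0, min(2 * a, math.sqrt(2)), 1.0, math.sqrt(2)})
    integral = sum(simpson(g, lo, hi) for lo, hi in zip(cuts, cuts[1:]))
    return m * m * integral - m * m           # eq. (57) minus the squared mean

variance = poisson_variance_circles_on_square(0.2, 0.5)
print(variance)
```

The range is split at $r = 2a$ and $r = 1$ because the integrand has breaks in its first derivative there; the printed value can be compared with the entry for case (i) of Table 1.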

* The solution to this problem (no. 698), given by Whitworth (1897), contains an error, resulting in the incorrect
value of 35/88 nearly.

In the case of rectangles falling on rectangles, the necessary formula for the variance is
given by Neyman & Bronowski in the form of a series, viz.

\[ \sigma^2(Y/A) = \frac{4e^{-2\lambda\alpha\beta}}{a^2 b^2} \sum_{s=1}^{\infty} \frac{\lambda^s (\alpha\beta)^{s+1}}{s!\,(s+1)^2 (s+2)^2} \]
\[ \qquad \times \{(s+2)a - \alpha + [\alpha - a](1 - a/\alpha)^{s+1}\}\,\{(s+2)b - \beta + [\beta - b](1 - b/\beta)^{s+1}\}. \tag{58} \]
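The series (58) converges quickly and is easy to sum directly. This is a minimal sketch, assuming the series takes the form reconstructed above with the sum starting at $s = 1$; the function name and the number of terms are illustrative choices.

```python
import math

def pos(x):
    """The bracket symbol [x]: x when positive, zero otherwise."""
    return x if x > 0 else 0.0

def variance_rect_poisson(a, b, alpha, beta, lam, terms=80):
    """Variance of Y/A for rectangles alpha x beta falling on a x b, Poisson fall."""
    total = 0.0
    for s in range(1, terms + 1):
        fa = (s + 2) * a - alpha + pos(alpha - a) * (1 - a / alpha) ** (s + 1)
        fb = (s + 2) * b - beta + pos(beta - b) * (1 - b / beta) ** (s + 1)
        coef = (lam ** s * (alpha * beta) ** (s + 1)
                / (math.factorial(s) * (s + 1) ** 2 * (s + 2) ** 2))
        total += coef * fa * fb
    return 4 * math.exp(-2 * lam * alpha * beta) * total / (a * a * b * b)

# Squares of area C = 0.2 on the unit square, mean uncovered fraction 0.5,
# so that lam * C = log 2 by eq. (56).
side = math.sqrt(0.2)
variance = variance_rect_poisson(1.0, 1.0, side, side, math.log(2) / 0.2)
print(variance)
```

For these values the sum gives a variance of about 0.0300, in agreement with the corresponding entry for case (v) of Table 1.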

Table 1. Variance of fraction of fixed area not covered by areas C
falling according to Poisson distribution

(i) Circles falling on square; (ii) circles falling on rectangle 2 x 1; (iii) circles falling on rectangle 4 x 1;
(iv) circles falling on circle; (v) squares falling on square (sides parallel).

Size of falling area C               Mean area not covered
÷ size of fixed area      Case       0.25       0.50       0.75
                                     Variance of area not covered

0.2                       (i)        0.0186     0.0303     0.0254
                          (ii)       0.0182     0.0296     0.0248
                          (iii)      0.0169     0.0275     0.0229
                          (iv)       0.0190     0.0310     0.0260
                          (v)        0.0182     0.0300     0.0252

1.0                       (i)        0.0608     0.0964     0.0789
                          (ii)       0.0573     0.0904     0.0743
                          (iii)      0.0489     0.0766     0.0627
                          (iv)       0.0620     0.0983     0.0810
                          (v)        0.0593     0.0945     0.0781

1.8                       (i)        0.0816     0.1262     0.1026
                          (ii)       0.0771     0.1193     0.1040
                          (iii)      0.0657     0.1015     0.0824
                          (iv)       0.0829     0.1281     0.1040
                          (v)        0.0790     0.1230     0.1002

In general, the variance increases in the following order as between the different combinations of shapes:

(1) Circles on rectangle 4 x 1, (iii).

(2) Circles on rectangle 2 x 1, (ii).

(3) Squares on square, (v).

(4) Circles on square, (i).

(5) Circles on circle, (iv).

There is an exception in the last case considered, however (mean fraction of area not covered = 0.75, size of
falling area ÷ size of fixed area = 1.8), when the order is changed somewhat. However, the variances are generally of the
same approximate magnitude.

(iii) Contagious distribution 

Neyman & Bronowski have included in their study the case of a contagious law of type A
with two parameters (see Neyman, 1939). Here the probability generating function is

\[ G_3(u) = e^{m\{e^{\lambda T(u - 1)} - 1\}}. \tag{59} \]

They have pointed out that the expression of this as a series enables calculations to be utilized
from the Poisson distribution.

In the general case the sth moment of the fraction of $A$ not covered is the expectation of

\[ e^{m(e^{-\lambda Z} - 1)}, \tag{60} \]

where $Z$ is the area common to $s$ $C$'s falling with their centres on $A$.



10. Numerical results


It is impossible to calculate complete tables covering all cases, but it is of interest to calculate
a few values for the purpose of comparison, and the following combinations of areas have been
considered:

        Fixed area          Falling areas
(i)     Square              Circles
(ii)    Rectangle 2 x 1     Circles
(iii)   Rectangle 4 x 1     Circles
(iv)    Circle              Circles
(v)     Square              Squares (with sides parallel to fixed square)

In each case the fixed area has been made of unit size, while the falling areas were respectively
C = 0.2, 1.0 and 1.8. The areas were assumed to fall according to the Poisson distribution,
the number of centres per unit area being such that the expected fraction of the fixed area
A not covered was respectively 0.25, 0.5 and 0.75. Since from equation (56) the expected
fraction of A not covered is $m = e^{-\lambda C}$, the relations between $\lambda$ and $C$ for the 9 combinations
of $m$ and $C$ are $\lambda C = \log_e 4$, $\log_e 2$ and $\log_e 4/3$.

The 2nd moments were determined by quadrature from formula (57), where the falling areas
were circles, and by direct evaluation of the series (58) for the case of squares on squares.
The results are given in Table 1.
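The nine combinations can be tabulated directly. The sketch below assumes only the relation m = exp(-λC) between the expected uncovered fraction and the density of centres; the function name `poisson_density` is illustrative and not taken from the paper.

```python
import math

# Expected uncovered fractions and falling-area sizes quoted in the text.
mean_uncovered = [0.25, 0.5, 0.75]
falling_area = [0.2, 1.0, 1.8]


def poisson_density(m, c):
    """Centres per unit area (lambda) solving m = exp(-lambda * c)."""
    return -math.log(m) / c


for c in falling_area:
    for m in mean_uncovered:
        lam = poisson_density(m, c)
        print(f"C = {c:3.1f}  m = {m:4.2f}  lambda*C = {lam * c:.4f}  lambda = {lam:.4f}")
```

For every C the product λC depends only on m, which is why three logarithms cover all nine cases.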

11. Experimental investigation

Before the work of Robbins and Neyman & Bronowski was brought to the notice of the
writer, an attempt was made to obtain experimentally a general formula for the variance of
the fraction of area not covered. Attention was confined to the case of circles falling on
squares, the centres of the former being chosen randomly (by means of random numbers)
within the area T whose boundary is at a distance a outside the sides of the unit square A.

For each combination of C and k, samples of up to 200 in size were drawn. The various
combinations were as given in Table 2.


Table 2. Ranges of k and C covered in experimental determination of variance

  C = (area of circles dropped) / (area of square)      No. of circles, k
  0.0077                                                5, 10, 15
  0.031                                                 5, 10, 15, 20, 30
  0.033                                                 5, 20, 40, 80, 120
  0.25                                                  1, 2, 4, 6
  1                                                     1, 2, 4, 6


Three methods were used to measure the fraction of the fixed square not covered in each 
sample. Method P involved the measurement of the covered area by planimeter. Method 
L utilized a photoelectric cell to measure the amount of light passing through a glass plate 
on which black paper disks had been stuck. Method C consisted of a simple count of squares 
on graduated paper, and generally this was the most convenient to operate. (The two neigh- 
bouring values of C were used to compare methods L and P.) 

For each combination of $C$ and $k$ the average fraction not covered, $\bar{y}$, was compared with
the theoretical value

$$m = (1 - C/T)^k. \qquad (61)$$

The observed standard deviation of the observations being $s$, the appropriate criterion for
testing the mean is

$$t = \frac{\bar{y} - m}{s/\sqrt{p}},$$

$p$ being the number of observations in the sample. The results are given in Table 3.
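The experiment can be imitated numerically. The following is a minimal sketch, not the writer's procedure: it assumes that T is the unit square with a border of width a and rounded corners (so T = 1 + 4a + πa²), replaces the count of squares on graduated paper by a count of grid midpoints, and uses the n − 1 divisor for s. The names `simulate_uncovered` and `t_statistic` are hypothetical.

```python
import math
import random


def simulate_uncovered(c, k, samples, grid=20, seed=1):
    """Uncovered fraction of the unit square for each of `samples` drops of k circles."""
    rng = random.Random(seed)
    a = math.sqrt(c / math.pi)  # radius of a circle of area c
    # Grid midpoints stand in for the squares of graduated paper (method C).
    pts = [((i + 0.5) / grid, (j + 0.5) / grid) for i in range(grid) for j in range(grid)]
    fractions = []
    for _ in range(samples):
        centres = []
        while len(centres) < k:
            x = rng.uniform(-a, 1 + a)
            y = rng.uniform(-a, 1 + a)
            # Distance from (x, y) to the unit square; reject points outside T.
            dx = max(-x, 0.0, x - 1.0)
            dy = max(-y, 0.0, y - 1.0)
            if dx * dx + dy * dy <= a * a:
                centres.append((x, y))
        uncovered = sum(
            all((px - cx) ** 2 + (py - cy) ** 2 > a * a for cx, cy in centres)
            for px, py in pts
        )
        fractions.append(uncovered / len(pts))
    return fractions


def t_statistic(values, m):
    """Mean, standard deviation and t = (mean - m) / (s / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, s, (mean - m) / (s / math.sqrt(n))


c, k = 0.25, 4
a = math.sqrt(c / math.pi)
T = 1 + 4 * a + math.pi * a * a  # assumed area of T
m = (1 - c / T) ** k             # theoretical mean, equation (61)
mean, s, t = t_statistic(simulate_uncovered(c, k, samples=200), m)
print(f"m = {m:.4f}  observed mean = {mean:.4f}  s = {s:.4f}  t = {t:.3f}")
```

Because each grid point is uncovered with probability exactly m under these assumptions, the simulated mean should differ from m only by sampling error.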

Table 3. Comparison between observed and expected values of fraction of area not covered

P = planimeter method. L = photoelectric method. C = counting method.

  Area of circle   No. of    No. in    Mean area not covered   Standard    Deviation t      Method of
  ÷ area of        circles   sample    Observed    Expected    deviation   = (ȳ − m)        measurement
  square, C        k         p         ȳ           m           s           ÷ (s/√p)

  0.0077             5       100       0.9694      0.9683      0.0055       2.0000*         P
  0.0077            10       100       0.9394      0.9376      0.0088       2.0464*         P
  0.0077            15       100       0.9105      0.9079      0.0090       2.8889†         P
  0.031              5       100       0.8924      0.8961      0.0222      −1.6667          P
  0.031             10       200       0.8007      0.8030      0.0334      −0.9739          P
  0.031             15       100       0.7134      0.7196      0.0404      −1.5347          P
  0.031             20       100       0.6349      0.6449      0.0423      −2.3641*         P
  0.031             30       100       0.5057      0.5179      0.0493      −2.4746*         P
  0.033              5       100       0.8853      0.8901      0.0250      −5.9200†         L
  0.033             20       100       0.6326      0.6278      0.0500       0.9600          L
  0.033             40       100       0.3969      0.3941      0.0367       0.7629          L
  0.033             80        75       0.1617      0.1553      0.0358       1.4750          L and P
  0.033            120        50       0.0657      0.0612      0.0255       1.2478          P
  0.25               1        30       0.8916      0.8950      0.0824      −0.2260          C
  0.25               2       110       0.7853      0.8009      0.1214      −1.3477          C
  0.25               4       110       0.6263      0.6415      0.1481      −1.0764          C
  0.25               6       110       0.4891      0.5138      0.1307      −1.9820*         C
  1.0                1        20       0.7583      0.7653      0.2350      −0.1332          C
  1.0                2        50       0.5890      0.5856      0.2346       0.1025          C
  1.0                4        50       0.3861      0.3430      0.2334       1.3057          C
  1.0                6        50       0.1686      0.2008      0.1630      −1.3968          C

* Between 5 and 1 % levels. † Beyond 1 % level.


An examination of the values of t shows that too many of them are outside the 5 %
significance levels, while in each set corresponding to one value of C the values are too
frequently of the same sign. The worst deviation is for C = 0.033 and k = 5, with t = −5.92,
but this only corresponds to a difference between the observed mean of 0.885 and the
theoretical mean of 0.890. The test is thus very sensitive and the deviations are not serious;
they arise from imperfections in the technique which have not been investigated in detail.

As regards random errors of measurement as distinct from bias, it was not possible to
carry out a systematic estimation of the contribution of this source to the total variation.
A series of repeated measurements for the case C = 0.031, k = 30, for which the observed
mean was 0.51, showed that the individual measurements had a standard error of about
0.009. As the total standard deviation in this case was 0.049, the true estimate of the standard
error (i.e. omitting the error of measurement) was $\sqrt{0.049^2 - 0.009^2} = 0.048$, and for our
purposes this difference is negligible. There is thus some evidence for assuming that this
method of estimating the variance was satisfactory.
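The correction for measurement error is a subtraction of variances. The short check below assumes, as the calculation in the text implies, that the measurement error is independent of the sampling variation; the variable names are illustrative.

```python
import math

total_sd = 0.049        # observed standard deviation for C = 0.031, k = 30
measurement_sd = 0.009  # standard error of a single measurement

# Independent sources of error add in variance, so the measurement
# component is removed by subtracting its square.
true_sd = math.sqrt(total_sd ** 2 - measurement_sd ** 2)
print(round(true_sd, 3))  # 0.048
```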

12. Derivation of empirical formula for the variance

The consideration that the theoretical variance $\sigma^2$ of the fraction uncovered must be
small whenever the mean $m$ of this fraction is near the limits of its range, zero or unity,
suggests that we might try the relation

$$\sigma^2 \propto m(1 - m)$$

for given $C$. Accordingly we have calculated the quantity

$$g = \frac{s^2}{m(1 - m)}, \qquad (62)$$

and the results are given in Table 4.

Table 4. Derivation of empirical formula

[The columns giving g = s²/m(1 − m) for each k, the exact variance and the percentage errors are illegible in the source.]

  Area of circles   No. of     Observed      Mean values    (T/C)^(3/2)    Empirical value
  ÷ area of         circles    variance      of g = g(C)    × g(C)         of variance
  square, C         k          s²                                          σ₁²

  0.0077              5        0.0000303                                   0.0000363
  0.0077             10        0.0000774     0.00108        2.10           0.0000692
  0.0077             15        0.0000810                                   0.0000990
  0.031               5        0.000493                                    0.000684
  0.031              10        0.00112                                     0.00116
  0.031              15        0.00163       0.00759        2.38           0.00148
  0.031              20        0.00179                                     0.00168
  0.031              30        0.00243                                     0.00184
  0.033               5        0.000625                                    0.000785
  0.033              20        0.00250                                     0.00188
  0.033              40        0.00135       0.00876        2.51           0.00192
  0.033              80        0.00128                                     0.00105
  0.033             120        0.000650                                    0.000461
  0.25                1        0.00679                                     0.00736
  0.25                2        0.0147        0.0820         2.41           0.01249
  0.25                4        0.0219                                      0.0180
  0.25                6        0.0171                                      0.0196
  1.0                 1        0.0552                                      0.0471
  1.0                 2        0.0550        0.234          2.05           0.0636
  1.0                 4        0.0545                                      0.0590
  1.0                 6        0.0266                                      0.0420

Fig. 2. Derivation of empirical formula for variance. [Plot of g against T/C on logarithmic scales.]

For each value of C the values of g are by no means constant, but the variation is not
excessive, and it is considered that for practical purposes we can take g to be a function
of C, at least over the range considered. To obtain a suitable form for this function, it was
decided, for very general reasons, to seek a simple relation between g(C) and T/C, the latter
being roughly the number of C’s which could be placed on T if they could be fitted together
without overlapping.


Table 5. Comparison of empirical formula for variance with exact value

σ² = exact value, σ₁² = empirical value, E = percentage error = 100 (σ₁² − σ²)/σ².

  C = πa²          k = 1      k = 2      k = 3      k = 4      k = 5      k = 6

  0.2     σ²       0.00501    0.00873    0.0114     0.0132     0.0144     0.0150
          σ₁²      0.00516    0.00896    0.0117     0.0135     0.0147     0.0154
          E        +3.0       +2.6       +2.6       +2.3       +2.1       +2.7

  0.4     σ²       0.0156     0.0246     0.0290     0.0305     0.0300     0.0274
          σ₁²      0.0149     0.0237     0.0284     0.0304     0.0305     0.0294
          E        −4.5       −3.7       −2.1       −0.3       +1.7       +7.3

  0.6     σ²       0.0277     0.0403     0.0447     0.0427     0.0395     0.0339
          σ₁²      0.0257     0.0384     0.0431     0.0433     0.0407     0.0370
          E        −7.2       −4.7       −3.6       +1.4       +3.0       +9.1

  0.8     σ²       0.0398     0.0541     0.0552     0.0501     0.0428     0.0351
          σ₁²      0.0365     0.0517     0.0551     0.0525     0.0471     0.0407
          E        −8.3       −4.4       −0.2       +4.8       +10.0      +16.0

  1.0     σ²       0.0508     0.0651     0.0628     0.0540     0.0436     0.0339
          σ₁²      0.0471     0.0636     0.0648     0.0590     0.0507     0.0420
          E        −7.3       −2.3       +3.2       +9.3       +16.3      +23.9

  1.2     σ²       0.0605     0.0737     0.0677     0.0554     0.0427     0.0317
          σ₁²      0.0571     0.0740     0.0724     0.0635     0.0525     0.0419
          E        −5.6       +0.4       +6.9       +14.6      +22.9      +32.5

  1.4     σ²       0.0689     0.0804     0.0706     0.0554     0.0410     0.0292
          σ₁²      0.0667     0.0832     0.0786     0.0665     0.0531     0.0411
          E        −3.2       +3.5       +11.3      +20.0      +29.5      +40.7

  1.6     σ²       0.0763     0.0855     0.0723     0.0546     0.0389     0.0267
          σ₁²      0.0757     0.0913     0.0834     0.0684     0.0530     0.0398
          E        −0.8       +6.8       +15.4      +25.3      +36.3      +49.1

  1.8     σ²       0.0828     0.0895     0.0730     0.0531     0.0367     0.0244
          σ₁²      0.0843     0.0985     0.0873     0.0695     0.0524     0.0383
          E        +1.8       +10.1      +19.6      +30.9      +42.8      +57.0


Fig. 2 shows the result of plotting the observed mean value of $g(C)$ against $T/C$ on
logarithmic scales. The points (the values of $s^2/m(1 - m)$ for various values of $k$ are plotted in
addition to the mean) lie reasonably close to a straight line of slope −1.5, indicating the
relation

$$g(C) \propto (C/T)^{3/2}. \qquad (63)$$

The values of $(T/C)^{3/2} g(C)$ are shown in Table 4, where the values are seen to lie between 2
and 2.5 with an average of 2.3. Thus we derive the rough empirical formula

$$\sigma_1^2 = \frac{2.3\,m(1 - m)}{(T/C)^{3/2}}, \qquad (64)$$

where $m = (1 - C/T)^k$.
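The slope read from Fig. 2 can be checked against the mean values of g quoted in Table 4. The sketch below is an illustration rather than the writer's graphical procedure: it fits a least-squares line to log g against log(T/C), and it assumes that T is the unit square with a border of width a and rounded corners, T = 1 + 4a + πa², where a is the circle radius. The name `area_T` is hypothetical.

```python
import math


def area_T(c):
    """Assumed area of T: unit square plus a border of width a with rounded corners."""
    a = math.sqrt(c / math.pi)
    return 1 + 4 * a + math.pi * a * a


# Mean values of g(C) quoted in Table 4 for the five circle sizes.
mean_g = {0.0077: 0.00108, 0.031: 0.00759, 0.033: 0.00876, 0.25: 0.0820, 1.0: 0.234}

xs = [math.log(area_T(c) / c) for c in mean_g]
ys = [math.log(g) for g in mean_g.values()]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

# Ordinary least-squares slope of log g on log(T/C).
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)

# Constant multiplying (C/T)^(3/2) for each circle size.
constants = [g * (area_T(c) / c) ** 1.5 for c, g in mean_g.items()]

print(f"fitted slope = {slope:.2f}")
print("constants:", [round(v, 2) for v in constants])
```

The fitted slope lies close to −1.5 and the constants fall in the range from 2 to about 2.5 described in the text.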



Values of $\sigma_1^2$ calculated from this formula are given in Table 4, together with the values in
some cases of the percentage error of $\sigma_1^2$ compared with the true value $\sigma^2$ obtained from the
methods described in §5; the percentage error in the estimate $s^2$ observed experimentally is
also given. (It was not possible to evaluate $\sigma^2$ in all cases, as the computation is somewhat
laborious.) Another set of comparisons between the empirical and the exact formula is given
below in Table 5, over the range $C = \pi a^2$ from 0.2 to 1.8 and k = 1, 2, 3, 4, 5, 6.

It will be seen that the empirical formula gives quite a satisfactory fit, e.g. with an error
less than 10 %, over a considerable part of the range studied, but that the error tends to
increase, i.e. the formula exaggerates the variance, as C and k increase.
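The empirical formula is simple to evaluate. The sketch below assumes the exponent 3/2 implied by the slope of −1.5, the constant 2.3, and an area T = 1 + 4a + πa² (unit square with a border of width a and rounded corners); the function names are illustrative. The exact variances used for comparison are entries of Table 5.

```python
import math


def area_T(c):
    """Assumed area of T for circles of area c falling on the unit square."""
    a = math.sqrt(c / math.pi)
    return 1 + 4 * a + math.pi * a * a


def empirical_variance(c, k, const=2.3):
    """Empirical variance const * m * (1 - m) / (T/C)^(3/2) with m = (1 - C/T)^k."""
    T = area_T(c)
    m = (1 - c / T) ** k
    return const * m * (1 - m) / (T / c) ** 1.5


# (C, k, exact variance from Table 5)
for c, k, exact in [(0.2, 1, 0.00501), (1.0, 2, 0.0651), (1.8, 6, 0.0244)]:
    est = empirical_variance(c, k)
    error = 100 * (est - exact) / exact
    print(f"C = {c}  k = {k}  empirical = {est:.4f}  exact = {exact}  E = {error:+.1f} %")
```

The three cases show the pattern noted in the text: close agreement for small C and k, and a growing overestimate as both increase.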


13. Use of the empirical formula for the Poisson case

If $k$ circles are dropped with their centres falling at random on the area $T$ the mean area not
covered can be written as

$$m(k) = (1 - C/T)^k,$$

and the empirical formula for the variance as

$$\sigma_1^2(k) = \frac{2.3\,m(k)\,[1 - m(k)]}{(T/C)^{3/2}}.$$

Table 6. Comparison of empirical formula for variance of area not covered
in case of circles falling on square according to Poisson distribution

  Size of falling area                 Mean area not covered, m
  ÷ fixed area, C                      0.25        0.5         0.75

  0.2         σ²                       0.0186      0.0303      0.0254
              σ₁²                      0.0196      0.0308      0.0257
              % error                  5.4         1.7         1.2

  1.0         σ²                       0.0608      0.0964      0.0789
              σ₁²                      0.0669      0.0981      0.0781
              % error                  10.0        1.8         −1.0

  1.8         σ²                       0.0816      0.1262      0.1026
              σ₁²                      0.0942      0.1348      0.1057
              % error                  15.4        6.8         3.0


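Hence if $k$ follows the Poisson distribution with expectation $\lambda T$, the total variance based
on the empirical formula is

$$\sigma_1^2 = \sum_{k=0}^{\infty} e^{-\lambda T} \frac{(\lambda T)^k}{k!} \left[ m^2(k) + \sigma_1^2(k) \right] - m^2, \qquad (65)$$

where $\sigma_1^2(k)$ is the empirical variance for $k$ circles and

$$m = \text{expected area not covered} = e^{-\lambda C}. \qquad (66)$$

We find after expanding $m^2(k)$ that

$$\sigma_1^2 = m^2 \left[ \left\{ 1 - \frac{2.3}{(T/C)^{3/2}} \right\} m^{-C/T} - 1 \right] + \frac{2.3\,m}{(T/C)^{3/2}}. \qquad (67)$$

This formula is compared with the exact values in Table 6 over the same range as in Table 1.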
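The closed form can be checked against the Poisson sum it was derived from. The sketch below is written under stated assumptions: the empirical variance for fixed k is taken as 2.3 m(k)[1 − m(k)]/(T/C)^(3/2), T is taken as 1 + 4a + πa² for circles of radius a on the unit square, and the closed form used is m²[(1 − g)m^(−C/T) − 1] + gm with g = 2.3/(T/C)^(3/2). All function names are hypothetical.

```python
import math


def area_T(c):
    """Assumed area of T for circles of area c falling on the unit square."""
    a = math.sqrt(c / math.pi)
    return 1 + 4 * a + math.pi * a * a


def poisson_variance_closed(c, m, const=2.3):
    """Closed form of the total empirical variance for the Poisson case."""
    T = area_T(c)
    g = const / (T / c) ** 1.5
    return m * m * ((1 - g) * m ** (-c / T) - 1) + g * m


def poisson_variance_series(c, m, const=2.3, terms=200):
    """Direct summation over k of the Poisson-weighted second moment, minus m^2."""
    T = area_T(c)
    g = const / (T / c) ** 1.5
    mu = -math.log(m) * T / c  # lambda * T, since m = exp(-lambda * C)
    weight = math.exp(-mu)     # Poisson probability of k = 0
    total = 0.0
    for k in range(terms):
        mk = (1 - c / T) ** k
        total += weight * (mk * mk + g * mk * (1 - mk))
        weight *= mu / (k + 1)  # step to the probability of k + 1
    return total - m * m


for c, m in [(0.2, 0.25), (1.0, 0.5), (1.8, 0.75)]:
    print(c, m, round(poisson_variance_closed(c, m), 4), round(poisson_variance_series(c, m), 4))
```

The two routes agree to rounding error, and the values reproduce the empirical entries of Table 6 for the cases shown.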



The agreement is again reasonably satisfactory over the greater part of the range, large
positive errors occurring for large values of C and small values of m. These errors might be
reduced by using a constant rather smaller than 2.3 in the empirical formula, but the point
has not been investigated further.

Summary 

The mathematical study of bombing has given rise to the following problem. A fixed 
outline, such as a square or circle, is drawn on a plane, and other similar outlines are dropped 
at random on it. Estimates are then required of the variance of the fixed area which is not 
covered. Work by Robbins enables a theoretical formula to be derived, and Bronowski 
& Neyman have treated, by an independent method, the special case of rectangles falling 
on rectangles. 

It is shown that in the case of circles falling on circles, squares or rectangles, the variance
can be expressed as the integral with respect to r of the product of two functions, one being
a simple function of the area of overlap of two circles with centres r apart, and the other
being the frequency function of the distance r between two points chosen at random in the
‘covered’ area. This applies both to the case where the number of falling areas is fixed and
where it follows a Poisson distribution. Numerical values have been calculated for a
number of cases. An experimental method had been carried out prior to the above theoretical
work, and the following empirical formula was derived for the variance of the
fraction of a fixed square not covered by k circles, area C, falling at random on an area T
containing centres of all C’s which cover or touch the fixed square:

$$\sigma_1^2 = \frac{2.3\,m(1 - m)}{(T/C)^{3/2}},$$

where $m$ = mean fraction of area not covered

$$= (1 - C/T)^k.$$

This formula, and its extension to the Poisson case, have been shown to be in reasonable 
agreement with the exact values over a considerable range. 

The writer is indebted to Miss G. O. Jeffcoate for valuable assistance in the computing
and experimental work.


REFERENCES 

Bronowski, J. & Neyman, J. (1945). Ann. Math. Statist. 16, 330.

Neyman, J. (1939). Ann. Math. Statist. 10, 35.

Robbins, H. E. (1944). Ann. Math. Statist. 15, 70.

Robbins, H. E. (1945). Ann. Math. Statist. 16, 342.

Whitworth, W. A. (1897). DCC Exercises in Choice and Chance. Cambridge.




A STUDY OF A FIRST DYNASTY SERIES OF EGYPTIAN SKULLS 
FROM SAKKARA AND OF AN ELEVENTH DYNASTY SERIES 

FROM THEBES 

By A. BATRAWI, Ph.D. and G. M. MORANT, D.Sc. 

1. Introduction. This paper deals with forty-four male crania of 1st dynasty date
(c. 3400 b.c.) discovered at Sakkara by Macramallah Effendi, who has published a report 
on the excavations (1940), and with fifty-five crania of 11th dynasty (c. 2000 b.c.) soldiers 
unearthed at Thebes in 1927 by the Egyptian Expedition of the Metropolitan Museum of 
Art, New York (Winlock, 1928). The cemetery at Sakkara, 20 miles south of Gizeh, was used 
by the middle classes of the local community. Prof. D. E. Derry has kindly provided the 
following notes on it : 

The 1st dynasty cemetery at Sakkara excavated by Macramallah Effendi is of special interest. 
Comparatively few cemeteries of this date have been found, and, while the total number of forty-four
skulls from which reliable measurements could be taken was small, yet the results yielded by these 
are such as to show that we are dealing with a race which differs in important features from those 
exhibited by the so-called predynastic people. 

The observation that there were two races in Egypt in the early dynastic period was first made in 
the year 1909, when the results of measurements obtained from a series of male and female skulls of 
the 4th and 5th dynasties from the great necropolis surrounding the pyramids of Giza came to be
examined and compared with crania from early predynastic graves. Until then the theory of an
unbroken evolution of the Egyptian race from prehistoric times right through the dynastic period had
been taught. It now became obvious that the culture which we know of as peculiarly Egyptian was
associated with a race which could not have been derived from the predynastic people. The introduction
of stone-working resulting in the erection of great tombs and statuary, as well as beautifully executed
reliefs, paintings and above all writing, all pointed to a race far in advance of the predynastic people, 
who although skilled in the making of bowls and vases in stone as well as in pottery, and who had 
already attained to the discovery of the uses of copper, were, nevertheless, little removed from the 
Neolithic period. 

The cemetery is unusual in consisting entirely of males. In the note on the skulls published in 
Macramallah Effendi’s report it is stated that there were some females included in the collection. After 
the report had gone to press Macramallah Effendi informed the writer that a part of the cemetery 
was of 18th dynasty date. It turned out that all the female skulls came from this part and that therefore 
the 1st dynasty cemetery contained only remains of males. Dr Batrawi’s examination of the figures
confirms the statement made at the beginning of this note and shows the closeness of the relationship 
of the people of the 1st dynasty at Sakkara with those of the 4th and 5th dynasties from Deshasheh 
and Medum. 

In his report on the discovery of the series of 11th dynasty skeletons at Thebes Mr H. E.
Winlock (1928) says that they were found in ‘a tomb in the row where the grandees of
Mentuhotep’s court had been buried’. He remarks:

Obviously what we had found was a soldiers’ tomb. To judge from the cheapness of their burial they 
were only soldiers of the rank and file, and yet they had been given a catacomb presumably prepared 
for the dependents of the royal household, next to the tomb of the chancellor Khety. Clearly that was 
an especial honour. If we are right in supposing that all had been buried at once, they must have been 
slain in a single battle. 

Prof. Derry examined the bodies on the spot, and he took measurements of the crania
and of some of the long bones. About sixty bodies were counted and all proved to be those
of adult males who had died in the prime of life. Prof. Derry says that the skeletons were
reburied after they had been examined.

2. The measurements of the crania . The Sakkara series was sexed and measured by 
Prof. Derry and we are indebted to him for allowing us to use his records in this paper. 
All the absolute measurements, given in Appendix II, are his readings with the exception 
of those of the foramen magnum , which were taken by one of the writers (A.B.) of this 
report. The measurements of the Thebes series were also kindly provided by Prof. Derry, 
together with means he had calculated. The readings for individual crania are given in 
Appendix III. 

The technique of measurement followed by Prof. Derry is that of the Monaco Congress 
(Duckworth, 1913). He had used this when measuring the predynastic Egyptian series of 
skulls from Badari, of which part was remeasured later in London by Miss B. N. Stoessiger 
(1927), who followed the biometric technique. The two sets of measurements of the same 
fifty-three specimens have been compared (Morant, 1935), thus showing in detail what 
relations are to be expected between readings obtained by following the two techniques. 
These results were taken into account in preparing the definitions of Prof. Derry’s measure- 
ments given in Appendix I below. The characters are denoted as far as possible there and 
in the tables by the customary index letters of the biometric technique. 

3. The nature of the two series. Mean measurements and standard deviations for the two 
series are given in Table 1. The longest series of Egyptian skulls measured, known as the 
E series, came from a cemetery at Giza used from the 26th-30th dynasties (Davin & 
Pearson, 1924). Judging from comparisons of constants for a number of cranial characters, 
most of the other ancient Egyptian series described exhibit almost precisely the same order 
of variation as the one from Giza. In general they have been found to be rather less variable 
than European cranial series, while there is no evidence that there was any appreciable 
change in the variation exhibited by Egyptian populations during the long period from 
early predynastic to Roman times. 

The two new series are shorter than several from Egypt previously described. Counting 
the number of characters for which the standard deviation for one series is greater or less 
than the corresponding constant for the other series, the situation is : 

Sakkara and Thebes: Sakkara s.d. greater for nine and less for ten characters; 

Sakkara and Giza: Sakkara s.d. greater for four and less for eleven characters; 

Thebes and Giza: Thebes s.d. greater for eight and less for nine characters. 

This crude comparison suggests that there can have been no marked differences between 
the variabilities of the three populations represented. As sets of differences are considered, 
the limit of significance accepted may be taken considerably higher than in the case of 
a single difference. Suppose that there is a real distinction if two of the standard deviations
differ by an amount which is 3.5 or more times its probable error. Then one significant
difference is found for the Sakkara and Thebes series (NH, L, Sakkara s.d. greater,
d/p.e.(d) = 3.8), none for the Sakkara and Giza, and three for the Thebes and Giza series
(H' 4.1, S₁ 3.5, S₂ 3.5, Giza s.d. the greater in all three cases). The two new series are too
short to give reliable comparisons, but the evidence suggests that the populations they 
represent were equally homogeneous, while both were rather less mixed in racial com- 
position than the 26th-30th dynasty population of Giza. 
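The criterion used here can be illustrated with the one significant Sakkara and Thebes difference. The sketch assumes that the probable error of a difference between two independent estimates is the square root of the sum of the squared probable errors, which is the usual rule but is not stated explicitly in the text; the function name is hypothetical and the figures are the NH, L entries of Table 1.

```python
import math


def ratio_to_probable_error(x1, pe1, x2, pe2):
    """Absolute difference of two independent estimates divided by its probable error."""
    return abs(x1 - x2) / math.sqrt(pe1 ** 2 + pe2 ** 2)


# Standard deviations of NH, L with their probable errors (Sakkara, then Thebes).
ratio = ratio_to_probable_error(4.00, 0.35, 2.52, 0.18)
print(round(ratio, 1))  # 3.8
```

The result exceeds the limit of 3.5 adopted in the text, in agreement with the value quoted there.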



4. Comparisons of mean measurements. Following biometric practice, it may be supposed
in such a case that no statistical analysis of the series can reveal its racial components.
The relationships of the series have to be judged by comparing them as wholes, on the basis
of mean measurements, with other series known to exhibit unexceptional variation. It
was shown by Morant (1925) that the recorded series of ancient Egyptian skulls can be
divided into two groups. These were called, for convenience, the Upper and Lower Egyptian,

Table 1. Means and standard deviations (with probable errors) of the Sakkara 1st dynasty
and Thebes 11th dynasty series of male skulls

                                        Means                             Standard deviations
  Character*           Sakkara               Thebes                Sakkara         Thebes

  L                    186.9 ± 0.56 (41)     181.8 ± 0.53 (54)     5.31 ± 0.40     5.75 ± 0.37
  B                    138.7 ± 0.41 (43)     138.3 ± 0.41 (54)     3.99 ± 0.29     4.52 ± 0.29
  B'                    96.6 ± 0.39 (36)      93.6 ± 0.43 (55)     3.48 ± 0.28     4.72 ± 0.30
  H'                   135.4 ± 0.67 (32)     137.1 ± 0.37 (51)     5.63 ± 0.47     3.91 ± 0.26
  [Aur. ht.]           114.8 ± 0.67 (27)     -                     5.20 ± 0.48     -
  LB                   102.7 ± 0.57 (29)     100.7 ± 0.34 (46)     4.57 ± 0.40     3.40 ± 0.24
  U                    518.8 ± 1.5 (29)      507.4 ± 1.3 (50)      12.0 ± 1.1      13.2 ± 0.89
  S₁                   -                     125.7 ± 0.47 (52)     -               5.01 ± 0.33
  S₂                   -                     129.4 ± 0.56 (53)     -               6.00 ± 0.39
  S₃                   -                     115.2 ± 0.82 (51)     -               8.63 ± 0.58
  S                    -                     370.5 ± 1.2 (50)      -               12.7 ± 0.86
  [Broca’s Q']         -                     300.3 ± 0.84 (49)     -               8.72 ± 0.59
  fml                   36.7 ± 0.26 (31)     -                     2.18 ± 0.19     -
  fmb                   30.4 ± 0.22 (29)     -                     1.74 ± 0.15     -
  [G'H]                 71.9 ± 0.55 (30)      72.0 ± 0.37 (45)     4.43 ± 0.39     3.71 ± 0.26
  GB                    96.5 ± 0.62 (25)      95.5 ± 0.48 (38)     4.57 ± 0.44     4.40 ± 0.34
  J                    127.8 (14)            127.6 ± 0.52 (32)     -               4.33 ± 0.36
  [NH, L]               51.2 ± 0.50 (29)      51.8 ± 0.25 (45)     4.00 ± 0.35     2.52 ± 0.18
  NB                    25.4 ± 0.21 (30)      25.0 ± 0.20 (42)     1.70 ± 0.15     1.92 ± 0.14
  [O₁]                  38.9 ± 0.24 (26)      39.1 ± 0.16 (44)     1.84 ± 0.17     1.55 ± 0.11
  [O₂]                  32.5 ± 0.20 (26)      33.1 ± 0.23 (44)     1.50 ± 0.14     2.23 ± 0.16
  [Prosthion GL]        99.6 ± 0.56 (26)      96.5 ± 0.47 (43)     4.23 ± 0.40     4.61 ± 0.34
  100 B/L               74.2 ± 0.26 (39)      76.1 ± 0.26 (54)     2.44 ± 0.19     2.84 ± 0.18
  100 H'/L              72.8 ± 0.41 (30)      75.5 ± 0.29 (51)     3.37 ± 0.29     3.06 ± 0.20
  100 B/H'             102.6 ± 0.60 (31)     100.8 ± 0.41 (51)     4.95 ± 0.42     4.34 ± 0.29
  100 fmb/fml           83.3 ± 0.72 (29)     -                     5.76 ± 0.51     -
  [100 G'H/GB]          74.3 ± 0.50 (25)      75.8 ± 0.53 (38)     3.70 ± 0.35     4.83 ± 0.37
  [100 NB/NH, L]        49.5 ± 0.58 (29)      48.3 ± 0.48 (42)     4.63 ± 0.41     4.61 ± 0.34
  [100 O₂/O₁]           83.6 ± 0.51 (26)      84.6 ± 0.49 (44)     3.86 ± 0.36     4.81 ± 0.35

* The characters are defined in Appendix I. A symbol in square brackets denotes either that the measurement
is one not usually included in the biometric technique, or else that Prof. Derry’s method of taking the
measurements does not accord with biometric practice.


though there is evidence that the regions represented changed somewhat with time. The 
series in the first group came from the neighbourhood of Thebes and sites farther south, 
while those in the second group came from the same region of Upper Egypt and sites 
farther north. The first group includes all the predynastic series that have been described 
and some of dynastic date, the latest being of the 18th dynasty: the second group ranges 
from the 1st dynasty to Roman times, though no series available earlier than the 4th 
dynasty had come from the region immediately south of the Delta. The Sakkara series 
described in the present paper extends the range of such material back to the 1st dynasty. 




It had been found that the means for all these series are almost constant for most of the 
metrical characters commonly recorded, but for a few measurements more significant 
differences are found and these separate the two groups of series. Characters of both kinds 
are treated in Table 2, which is based on Table XIII in Risdon’s paper (1939) on the human 
remains from Lachish (Palestine). The first six characters are those which make the clearest 
distinction between the Upper and Lower Egyptian types of series, and they are all breadths 
or dependent on breadths — the latter being the horizontal circumference and the two 
indices — of the cranium. The Sakkara series is clearly assigned to the Lower Egyptian 
group, and if counted as a member of this the range of the mean minimum frontal breadths 
(B') for the group is slightly extended. The Thebes series is also assigned to the Lower 
Egyptian group by four of the six characters in question: for U and 100 B/H', however, 
its means fall within the ranges given for the Upper Egyptian type of series. 


Table 2. Ranges of mean measurements for two groups of series of ancient Egyptian
male crania and means for the Sakkara and Thebes series*

  Series                 Period                     B                   J                   B'                U

  Upper Egyptian type    Early predyn.-18th dyn.    131.4-134.3 (10)    123.6-127.6 (8)     90.4-92.8 (4)     500.0-510.4 (4)
  Sakkara                1st dyn.                   138.7               127.8               96.5              518.8
  Thebes                 11th dyn.                  138.3               127.6               93.6              507.4
  Lower Egyptian type    1st dyn.-Roman             135.3-139.3 (9)     127.5-131.3 (8)     93.0-96.2 (6)     510.6-518.7 (5)

  Series                 Period                     100 B/L             100 B/H'            L                 H'

  Upper Egyptian type    Early predyn.-18th dyn.    71.7-73.7 (10)      98.1-101.1 (10)     182.2-185.2 (10)  132.4-135.9 (10)
  Sakkara                1st dyn.                   74.2                102.6               186.9             135.4
  Thebes                 11th dyn.                  76.1                100.8               181.8             137.1
  Lower Egyptian type    1st dyn.-Roman             73.7-76.0 (9)       102.3-106.4 (9)     181.4-185.8 (9)   130.7-136.0 (9)

* The characters are defined in Appendix I. The numbers in brackets give the numbers of series to which the ranges
relate. In the case of these previously described series the smallest number of crania on which any one of the means is
based is 16, though this minimum number is about 30 for most of the characters. The numbers on which the Sakkara
and Thebes means are based can be seen from Table 1, the only one less than 25 being 14 for the bizygomatic breadth
(J) of the Sakkara series.


The last two characters in Table 2, which are the length and height of the cranium, fail
to distinguish the two contrasted groups of series. The means for the two new series fall 
outside the ranges previously given by all the ancient Egyptian material, the Sakkara 
series giving the greatest L and the Thebes the greatest H' . The evidence of other characters 
must be taken into account, but so far the comparisons suggest that the two new series 
are of the Lower Egyptian type, and it is to be expected that they bear a closer resemblance 
to some of the series assigned to that group than to any other cranial series. 

At the same time it may be noted that the Sakkara 1st dynasty and Thebes 11th dynasty
populations are clearly differentiated by their mean cranial measurements. There are twenty
characters in Table 1 for which means for both series are available. The most significant
difference is for L, and it is 6.6 times its probable error, while five other characters (B', U,
prosthion GL, 100 B/L and 100 H'/L) also show differences which exceed four times their
probable errors.
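These ratios can be recomputed from the means and probable errors of Table 1. The sketch assumes that the probable error of a difference between independent means is the root of the sum of the squared probable errors; the dictionary layout is illustrative only.

```python
import math

# (Sakkara mean, probable error, Thebes mean, probable error) from Table 1.
table1 = {
    "L": (186.9, 0.56, 181.8, 0.53),
    "B'": (96.6, 0.39, 93.6, 0.43),
    "U": (518.8, 1.5, 507.4, 1.3),
    "Prosthion GL": (99.6, 0.56, 96.5, 0.47),
    "100 B/L": (74.2, 0.26, 76.1, 0.26),
    "100 H'/L": (72.8, 0.41, 75.5, 0.29),
}

# Difference between the two means in units of its probable error.
ratios = {
    name: abs(a - b) / math.hypot(pa, pb) for name, (a, pa, b, pb) in table1.items()
}

for name, r in ratios.items():
    print(f"{name:13s} {r:.1f}")
```

Under that assumption the ratio for L comes out at about 6.6, and the other five characters all exceed 4, as stated.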

5. Comparisons by coefficients of racial likeness. The method of Karl Pearson’s coefficient
of racial likeness has been applied extensively to series of ancient Egyptian crania. Risdon
(1939) has given comparisons made in that way for twenty-two male series, including three
from sites outside Egypt, and the treatment below is almost restricted to comparisons 
between these and the two new series described in the present paper. The procedure fol- 
lowed in applying the method described in several papers in Biometrika was adopted without 
modification.* 

In deriving a classification of a number of cranial, or living, series from the coefficients
of racial likeness found between them, it has been shown repeatedly that the most suggestive
arrangement is obtained if the closest resemblances of the series, indicated by
coefficients below a certain value, are alone taken into account. Risdon has given a diagram
(1939, Fig. 3) showing all the reduced coefficients less than 5.0 between the twenty-two
series with which he dealt. There are fifty-three of this lowest order among the 231
(= 22 x 21/2) comparisons. The addition of the two new series to the classification referred
to only requires a knowledge of the reduced coefficients less than 5.0 between them and
the twenty-two series.

It has been pointed out that inspection of a few mean measurements can indicate whether
a comparison of two particular series would almost certainly give a reduced coefficient
greater than the limit chosen, or whether it might provide a value less than 5.0. The
measurements used for this rapid test are six which are known to be those which show the
most significant differences, and the greatest proportions of such differences, in comparisons
of the group of series. These are the length, breadth and height of the brain-box and the
three indices derived from these chords. For the fifty-three comparisons of the twenty-two

* A ‘crude’ coefficient is defined by 

$$\frac{1}{m}\,S\left[\frac{n_s n_s'}{n_s+n_s'}\left(\frac{M_s-M_s'}{\sigma}\right)^2\right]-1\pm 0.6745\sqrt{\frac{2}{m}},$$

where $M_s$ is a mean based on $n_s$ crania for the first series, $M_s'$ and $n_s'$ are the corresponding constants for the 
second series and $m$ characters are compared. The $\sigma$'s of the long 26th-30th dynasty Egyptian series were 
used throughout. The crude coefficient may be written 

$$\frac{1}{m}\,S(\alpha)-1\pm 0.6745\sqrt{\frac{2}{m}},\quad\text{where}\quad \alpha=\frac{n_s n_s'}{n_s+n_s'}\times\frac{(M_s-M_s')^2}{\sigma^2}.$$

Its value is largely determined by the sizes of the two samples that happen to be available, if in fact they do not 
represent the same population. As many excavated crania are damaged to some extent, in the case of a particular 
series means for different characters will usually be based on various numbers of specimens (see Table 1). The 
mean number available for the characters used is denoted by $\bar{n}_s$ in the case of the first series and by $\bar{n}_s'$ in the 
case of the second series, and these ‘sizes’ of the samples are usually unequal and may be of very different orders. 
To obtain, as far as possible, a measure of the absolute divergence of the types compared which does not depend 
on the numbers of crania available, a ‘reduced’ coefficient of racial likeness is computed. This is defined to be 

$$\frac{100\times 100}{100+100}\times\frac{\bar{n}_s+\bar{n}_s'}{\bar{n}_s\,\bar{n}_s'}\left\{\frac{1}{m}\,S(\alpha)-1\pm 0.6745\sqrt{\frac{2}{m}}\right\}.$$

A reduced coefficient may be supposed a good approximation to the value which would be obtained if all the 
means for both series were for 100 individuals instead of for the numbers actually available. If a crude 
coefficient differs from zero by less than 3.5 times its probable error, a rare occurrence, then it is supposed 
that there is no evidence of a significant distinction between the two populations represented. In this case 
there is no need to compute a reduced coefficient. Otherwise, reduced coefficients are found and the classification 
of a number of series is based on these. 
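
The two definitions in this footnote can be illustrated with a short Python sketch. The function names and the argument layout are assumptions made for this example and do not come from the paper; the final lines rescale the crude coefficient printed in Table 3 for the Sakkara and Lachish comparison.

```python
import math

def crude_crl(means_a, means_b, sizes_a, sizes_b, sigmas):
    """Crude coefficient of racial likeness and its probable error.

    Each argument holds one entry per character compared: the two series
    means, the numbers of crania behind them, and the standard deviations
    of the reference series.
    """
    m = len(means_a)
    total = 0.0
    for ma, mb, na, nb, sd in zip(means_a, means_b, sizes_a, sizes_b, sigmas):
        # alpha for one character: weighted squared standardized difference
        total += (na * nb) / (na + nb) * ((ma - mb) / sd) ** 2
    coefficient = total / m - 1.0
    probable_error = 0.6745 * math.sqrt(2.0 / m)
    return coefficient, probable_error

def reduced_crl(crude, probable_error, mean_size_a, mean_size_b):
    """Rescale a crude coefficient to two samples of 100 crania each."""
    factor = (100 * 100) / (100 + 100) * (mean_size_a + mean_size_b) / (mean_size_a * mean_size_b)
    return crude * factor, probable_error * factor

# Sakkara (mean size 31.6) with Lachish (249.3): crude 1.04 with P.E. 0.25.
value, pe = reduced_crl(1.04, 0.25, 31.6, 249.3)
print(round(value, 2), round(pe, 2))  # prints 1.85 0.45
```

The rescaling factor depends only on the two mean sample sizes, which is why a crude coefficient and its probable error are multiplied by the same number.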




series giving reduced coefficients less than 5.0, the maximum differences for the six 
characters (in mm. or units of the indices) are: 

   L      B      H'     100 B/L    100 H'/L    100 B/H' 
   3.1    3.0    3.5    2.0        2.3         2.8 

To avoid the danger of missing comparisons which might be of the order required, in 
applying the test each of these values was increased arbitrarily by 0.2, giving: 

   L      B      H'     100 B/L    100 H'/L    100 B/H' 
   3.3    3.2    3.7    2.2        2.5         3.0 

In comparing a new series with the twenty-two it may be supposed that a reduced 
coefficient of racial likeness greater than 5.0 would almost certainly be found if the difference 
between the means is greater than the accepted limit in the case of any one or more 
of the six characters. For such comparisons the coefficients were not calculated. If the 
differences between the means are less than the limits for all six characters then a reduced 
coefficient less than 5.0 might be found: the coefficients were calculated in all such cases. 
In this way detailed comparisons were judged to be required between the new Sakkara, 
1st dynasty, series, on the one hand, and six of the twenty-two treated by Risdon on the 
other; and between the new Thebes, 11th dynasty, on the one hand, and only two of the 
twenty-two series on the other. The previously described series involved in these two sets 
of comparisons, one series being included in both sets, are: 

(i) Deshasheh and Medum, 4th and 5th dynasties (Thomson & MacIver, 1905). The two 
towns are south of Sakkara and both less than 40 miles from it. 

(ii) Gizeh, 26th-30th dynasties (Davin & Pearson, 1924). 

(iii) Sedment, 9th dynasty (Woo, 1930). 

These three and the new Sakkara series are all from Lower Egypt among the total 
twenty-four series referred to above. All the other Egyptian sites mentioned are in Middle 
Egypt and close to Abydos and Thebes. 

(iv) Abydos, 18th dynasty (Thomson & MacIver, 1905). 

(v) Abydos, 1st dynasty, royal tombs (Morant, 1925). 

(vi) Lachish, Palestine (Risdon, 1939). This series represents an Egyptian population. 
It is assigned to the seventh and eighth centuries b.c., though it is not well dated. 

(vii) Tigré district, Abyssinia, modern (Sergi, 1912, means given in Morant, 1925). 

(viii) Cretans, modern (von Luschan, 1913, means given in Woo, 1930). This series is 
not one of the twenty-two dealt with by Risdon. It was included because of its close 
resemblance to the new 11th dynasty series from Thebes. The test based on a comparison 
of the means of the six calvarial measurements shows that the only ancient Egyptian series 
which might give reduced coefficients less than 5.0 with the Cretan series are the Theban 
11th dynasty and the Sedment series ((iii) above). 
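
The rapid test described before this list can be sketched in Python as follows. The dictionary keys, the function name and the sample means are assumptions made for illustration (the means are invented, not taken from any series); only the six limits come from the text. The text does not say how a difference exactly equal to a limit was treated, so the sketch takes the cautious course and still asks for the full coefficient in that case.

```python
# Screening limits quoted in the text: observed maxima increased by 0.2.
LIMITS = {
    "L": 3.3, "B": 3.2, "H'": 3.7,
    "100 B/L": 2.2, "100 H'/L": 2.5, "100 B/H'": 3.0,
}

def needs_full_comparison(means_a, means_b, limits=LIMITS):
    """Return True when no mean difference exceeds its limit.

    True means a reduced coefficient below 5.0 might be found, so the
    coefficient should be computed; False means it can be skipped.
    """
    return all(abs(means_a[key] - means_b[key]) <= limit
               for key, limit in limits.items())

# Invented means for two hypothetical series, for illustration only.
series_a = {"L": 182.0, "B": 135.0, "H'": 133.0,
            "100 B/L": 74.2, "100 H'/L": 73.1, "100 B/H'": 101.5}
series_b = {key: value + 1.0 for key, value in series_a.items()}
series_c = dict(series_a, L=186.0)

print(needs_full_comparison(series_a, series_b))  # every difference is 1.0
print(needs_full_comparison(series_a, series_c))  # L differs by 4.0
```

A single character outside its limit is enough to drop a comparison, which is what kept the number of coefficients actually computed small.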

It must be emphasized that a reduced coefficient of racial likeness less than 5.0 represents 
a very close degree of resemblance. Values of that order have only been found between 
cranial series which would be expected, on account of their provenance, to represent the 
same or closely related populations. There is a danger that low reduced coefficients may be 
misleading owing to the influence on them of extraneous factors, such as inaccuracy in 
sexing or slight and unappreciated differences between the methods of measurement of 
two recorders working independently. It is safe to suppose that the two new series are 
made up entirely of the crania of adult males. In computing coefficients with them care 




was taken to restrict a particular comparison to pairs of means based on measurements 
obtained by following precisely the same technique. 

Owing partly to that restriction, the numbers of characters that could be used in com- 
puting coefficients with the new series are decidedly smaller than the 31 used ideally for 
the purpose. For these comparisons the smallest number of characters used is 9 and the 
largest number is 18.* Risdon (1939, pp. 131-2) has examined the matter experimentally 
and he concluded that use of a smaller number of characters — the set of 14 he considered 
being very similar to the sets we were able to use — can usually be expected to give a fairly 
close approximation to the result which would be obtained from about twice as many 


Table 3. Coefficients of racial likeness between ancient Egyptian, a Palestinian (Lachish) 
and modern series of male skulls from Abyssinia and Crete* 

Series                                                                         Crude C.R.L. ± P.E.    Reduced C.R.L. 

Sakkara, 1st dyn. (32.1) with Deshasheh and Medum, 4th and 5th dyn. (46.0)      0.19 ± 0.32 (9) 
                  (32.1) with Abydos, 18th dyn. (49.9)                         -0.08 ± 0.32 (9) 
                  (31.6) with Lachish (249.3)                                   1.04 ± 0.25 (14)       1.85 ± 0.45 
                  (31.6) with Gizeh, 26th-30th dyn. (885.7)                     2.45 ± 0.25 (14)       4.02 ± 0.41 
                  (31.6) with modern Abyssinian (61.4)                          2.43 ± 0.25 (14)       5.82 ± 0.60 
                  (31.0) with Abydos, 1st dyn. royal tombs (33.6)               1.91 ± 0.25 (14)       5.87 ± 0.77 
                  (30.3) with Thebes, 11th dyn. (46.7)                          4.26 ± 0.22 (18)      11.45 ± 0.59 

Thebes, 11th dyn. (49.2) with Sedment, 9th dyn. (37.9)                         -0.43 ± 0.28 (12) 
                  (49.0) with modern Cretans (50.4)                             2.01 ± 0.30 (10)       4.04 ± 0.60 
                  (48.3) with Deshasheh and Medum, 4th and 5th dyn. (46.0)      2.23 ± 0.32 (9)        4.73 ± 0.68 

Sedment, 9th dyn. (37.5) with Deshasheh and Medum, 4th and 5th dyn. (39.9)      1.88 ± 0.25 (14)       4.86 ± 0.65 
                  (37.7) with modern Cretans (47.9)                             2.71 ± 0.25 (15)       6.42 ± 0.59 

* The numbers in brackets following the names of the series are the mean numbers of crania for the characters 
used in computing the coefficients. The numbers in brackets following the crude coefficients are the numbers 
of characters on which they are based. Woo (1930) gives coefficients with two of the series in the table above, 
and the values there differ from his because they were recalculated omitting the term 1/m, which was discarded 
after 1930. The standard deviations of the long E series of 26th-30th dynasty crania from Gizeh (Davin & 
Pearson, 1924) were used in computing all the coefficients in the table. 

characters. Occasionally, however, use of a smaller number of characters may suggest 
a rather misleading conclusion, and it will tend to indicate a rather wider separation of the 
series than that which would be found if all 31 characters could be used. With these reservations 
in mind the coefficients with the new series may be accepted as the best approximations 
it is possible to obtain in the circumstances. 

All the coefficients of racial likeness found with the Sakkara 1st dynasty and Thebes 
11th dynasty series are given in Table 3. Fig. 1 is a reproduction of part of a diagram given 
by Risdon (1939, p. 137) for the twenty-two ancient Egyptian and related series with which 
he dealt, with the addition of the two series described in the present paper and that of 
modern Cretans. The Sakkara series is seen to be an unexceptional member of the ‘Lower 
Egyptian’ constellation, having two insignificant coefficients and other close connexions 

* The characters common to all the comparisons with the new series are L, B, H', LB, J, NB, 100 B/L, 
100 H'/L and 100 B/H'. Others used in some cases are B', U, S, fml, fmb and 100 fmb/fml, and for the coefficient 
between the two new series only G'H, NH, L, O₁', O₂, 100 G'H/GB, 100 NB/NH, L, 100 





with members of that group. On the other hand, the 11th dynasty soldiers from Thebes 
clearly represent an Egyptian population of an aberrant type. The direct comparison fails 
to distinguish this from that of the 9th dynasty series from Sedment. Woo (1930) had 
found that the latter stands apart from all the other Egyptian series, and he pointed out 
that the Sedment bears a closer resemblance to a series of modern Cretans (von Luschan, 
1913) than to most of the ancient Egyptian series. The Thebes 11th dynasty series has 
a reduced coefficient less than 5.0 with the Cretan, though the latter has no other coefficient 
of this order with any of the other Egyptian series. 

6. The racial history of ancient Egyptian populations. The new evidence makes rather 
more precise the racial classification of ancient Egyptian populations given in earlier 
craniological papers in Biometrika. The 1st dynasty series of crania from Sakkara is the 
earliest in date that has been described representing the region immediately south of the 
Delta. It is an unexceptional representative of that group which must have prevailed 

[Fig. 1: diagram. Legend: the line styles distinguish reduced coefficients that are insignificant, 
significant and < 3.5, and 3.5-5.0; a marginal note marks connexions with ‘Upper Egyptian’ series 
(early predynastic to 18th dyn.). Series plotted include Lachish; Abydos, royal tombs (1st dyn.); 
Sakkara (1st dyn.); Thebes (11th dyn.); Deshasheh & Medum (4th & 5th dyn.); Sedment (9th dyn.); 
Denderah (Ptolemaic and Roman); Thebes; Abydos; Gizeh (26th-30th dyn.); Abyssinian (modern); 
Cretans (modern).] 

Fig. 1. Reduced coefficients of racial likeness between the two new and other ancient 
Egyptian and related series of male crania. 


in the region — with only slight local and secular variants — from the 1st to the 30th 
dynasty and probably in both earlier and later periods as well. Such populations are said 
to be of ‘Lower Egyptian’ type. 

In the region to the south, round Thebes and Abydos, the population was of a second 
racial type from the earliest predynastic (Badari) epoch for which there is any adequate 
craniological evidence. This is called the ‘Upper Egyptian’, though it would be better 
to call it Southern Egyptian. The population became modified slowly down to some 
time about the 18th dynasty. The change was such that the ‘Upper Egyptian’ type 
of population came to bear a closer and closer resemblance to the ‘Lower Egyptian’, 
though the two groups remained clearly distinct. About the 18th dynasty there must have 
been a fairly rapid, if not abrupt, change in the racial composition of the population of the 
Thebes and Abydos region. Nearly all the series from there, of that and later dates, are not 
of ‘Upper’ but of ‘Lower Egyptian’ type. They diverge slightly from the populations of 
the region immediately south of the Delta, however, in the direction of the ‘Upper 
Egyptian’ type. Six of these series — viz. those from Abydos, Thebes and Denderah of 





dates ranging from the 18th dynasty to Roman times — are shown in Fig. 1. The 1st dynasty 
series from royal tombs at Abydos, also shown there, is an exception on account of its date. 
The obvious explanation of its peculiar position is that it represents an intrusive and more 
or less isolated community which was derived from the other centre of population to the 
north. 

This accounts for twenty-two of the twenty-four series of crania considered. The classi- 
fication of these does not seem to necessitate reference to any non-Egyptian peoples. 
This is not so, however, in the case of the remaining two series, viz. the new one of 11th 
dynasty soldiers from Thebes and the 9th dynasty series from Sedment (Deltaic region). 
These two might represent the same population as far as can be seen from the direct 
comparison, and both stand apart from the ‘Lower Egyptian’ constellation of series (see 
Fig. 1). The fact that the 11th dynasty series from Thebes has a close resemblance to one 
of Cretans, which is of modern date, suggests that the two aberrant communities in question 
may have been derived from the crossing of ancient Egyptians with people from some 
European or Asiatic source. 

The mean basio-bregmatic heights (H'), cephalic indices and height-length indices are 
higher for the 11th dynasty Thebes and Sedment series than for any other of the series 
considered. The types, defined by average measurements, of these two thus diverge from 
that prevailing in ancient Egypt in the direction of the ‘Armenoid’ type. Elliot Smith 
(1911 and elsewhere) supposed that intrusive ‘Armenoid’ aliens played a considerable 
part in modifying the population of the country and that ‘long before the time of the 
New Empire, Egypt was permeated from one end to the other with this foreign element’. 

Our interpretation of the evidence fails entirely to support this hypothesis. There is no 
need to suppose that any people foreign to the country played a substantial part in 
modifying its population from predynastic to Roman times. The communities represented 
by the 11th dynasty Thebes and Sedment series may possibly have been derived from the 
crossing of Egyptian and ‘Armenoid’ people, but they stand apart. The remarkable point 
is not that two out of twenty-four populations should be peculiar in that way, but that the 
remaining twenty-two show interrelationships which do not suggest any admixture with 
alien stock. They can readily be explained on the supposition that there was a steady 
transference of population from the Deltaic region to the region of Thebes and Abydos, 
where the population was originally of a somewhat different type, from early predynastic 
times to the 18th dynasty. About that time the movement must have been accelerated, 
and thereafter the populations of the two centres were almost indistinguishable in racial 
type. The racial history of ancient Egypt was of a simple kind. 

7. Summary and conclusions. This paper deals with forty-four male crania of 1st dynasty 
date from Sakkara and with fifty-five crania of 11th dynasty soldiers from Thebes. Individual 
measurements taken by Prof. D. E. Derry are given in appended tables. Judging 
from the rather small samples, the two populations represented exhibited the same order 
of variation, while both were rather less mixed in racial composition than the population 
of Giza from the 26th-30th dynasties. Mean measurements clearly differentiate the two 
new series from one another. Judging from characters considered singly, both series bear 
a close resemblance to some other ancient Egyptian series, and both are of ‘Lower’ rather 
than ‘Upper Egyptian’ type. Comparisons are made by the method of the coefficient of 
racial likeness, though decidedly fewer characters than the standard set of thirty-one used 
when possible are available for the purpose. The resulting relationships are shown in Fig. 1. 







APPENDIX III. INDIVIDUAL MEASUREMENTS OF THE CRANIA OF ELEVENTH DYNASTY SOLDIERS FROM THEBES* 





The Sakkara 1st dynasty series, which is the earliest from the region immediately south 
of the Delta, is an unexceptional member of the ‘Lower Egyptian’ constellation, and it can 
be supposed to typify the population of Northern Egypt at the time. The 11th dynasty 
series of soldiers from Thebes is linked to the same group, but it diverges from it. The 
type is indistinguishable from that of a 9th dynasty series, from Sedment. The former also 
has a link with the type of a series of modern Cretans. The two aberrant communities of 
Thebes and Sedment must be supposed to have been derived from the crossing of ancient 
Egyptians with people from some European or Asiatic source. Our knowledge of the racial 
history of ancient Egypt derived from craniological evidence is reviewed. 

REFERENCES 

Davin, A. G. & Pearson, K. (1924). On the biometric constants of the human skull. Biometrika, 16, 328-68. 
Duckworth, W. L. H. (1913). International Agreements for the Unification (a) of Craniometric and Cephalometric 
Measurements, (b) of Anthropometric Measurements to be made on the Living Subject. Camb. Univ. Press. 
Luschan, F. von (1913). Beiträge zur Anthropologie von Kreta. Z. Ethn. 14, 307-08. 
Macramallah, R. (1940). Un Cimetière Archaïque de la Classe Moyenne à Saqqarah. Imp. Nat. le Caire. 
Morant, G. M. (1925). A study of Egyptian craniology from prehistoric to Roman times. Biometrika, 17, 1-52. 
Morant, G. M. (1935). A study of predynastic Egyptian skulls from Badari based on measurements taken by 
Miss B. N. Stoessiger and Prof. D. E. Derry. Biometrika, 27, 293-309. 
Risdon, D. L. (1939). A study of the cranial and other human remains from Palestine excavated at Tell Duweir 
(Lachish) by the Wellcome-Marston archaeological research expedition. Biometrika, 31, 99-169. 
Sergi, S. (1912). Crania Habessinica. Contributo all' Antropologia dell' Africa Orientale. Rome: Loescher. 
Smith, G. Elliot (1911). The Ancient Egyptians and their Influence upon the Civilization of Europe. Harper. 
Stoessiger, B. N. (1927). A study of the Badarian crania recently excavated by the British School of 
Archaeology in Egypt. Biometrika, 19, 110-50. 
Thomson, A. & Randall-MacIver, D. (1905). The Ancient Races of the Thebaid. Oxford Univ. Press. 
Winlock, H. E. (1928). The Egyptian Expedition, 1925-27. Bull. Met. Mus. Art, New York. 
Woo, T. L. (1930). A study of seventy-one ninth dynasty Egyptian skulls from Sedment. Biometrika, 22, 65-93. 


Appendix I. Definitions of measurements 

Individual measurements of the two series of crania are given in Appendices II-III. The 
contractions used there and in tables in the text to denote characters are: 

L = maximum glabella-occipital length. B = maximum horizontal breadth. B' = minimum 
frontal breadth. H' = basio-bregmatic height. Aur. ht. = ‘vertical height from line 
joining highest points of external auditory meatuses’. LB = basion to nasion. U = maximum 
horizontal circumference above the superciliary ridges. S₁ = arc nasion to bregma. 
S₂ = arc bregma to lambda. S₃ = arc lambda to opisthion. S = total sagittal arc from 
nasion to opisthion. Broca's Q' = transverse arc from ‘the most prominent point on the 
posterior root of the left zygoma, exactly above the auditory aperture’, to the same point 
on the right passing through the bregma. fml = basion to opisthion. fmb = maximum 
breadth of foramen magnum. G'H = nasion to alveolar point. GB = facial breadth between 
lowest points on zygomatico-maxillary sutures. J = maximum breadth between zygomatic 
arches. NH, L = nasal height from nasion to point furthest removed from it on the 
margin of the left pyriform aperture. NB = maximum breadth of the pyriform aperture. 
O₁' = breadth of right orbit from the dacryon. O₂ = maximum height of right orbit. 
Prosthion GL = basion to prosthion. 





THE GENERALIZATION OF ‘STUDENT’S’ PROBLEM 
WHEN SEVERAL DIFFERENT POPULATION 
VARIANCES ARE INVOLVED 


By B. L. WELCH, B.A., Ph.D. 


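1. Introduction and summary. Let $\eta$ be a population parameter which is estimated by an 
observed quantity $y$, normally distributed with variance $\sigma_y^2$. Let $\sigma_y^2 = \sum_{i=1}^{k} \lambda_i\sigma_i^2$, where the $\lambda_i$ 
are known positive numbers and the $\sigma_i^2$ are unknown variances. Suppose that the observed 
data provide estimates $s_i^2$ of these variances, based on $f_i$ degrees of freedom, respectively, 
so that the sampling distribution of $s_i^2$ is 

$$p(s_i^2)\,ds_i^2 = \frac{1}{\Gamma(\tfrac{1}{2}f_i)}\left(\frac{f_i s_i^2}{2\sigma_i^2}\right)^{\frac{1}{2}f_i-1}\exp\left(-\frac{f_i s_i^2}{2\sigma_i^2}\right)d\left(\frac{f_i s_i^2}{2\sigma_i^2}\right), \tag{1}$$

and that these estimates are distributed independently of each other and of $y$. 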

A very simple particular case of this set-up occurs when we have samples of $n_1$ and $n_2$, 
respectively, from two normal populations with true means $\alpha_1$ and $\alpha_2$ and standard deviations 
$\sigma_1$ and $\sigma_2$. If $\eta$ is the true difference $(\alpha_1 - \alpha_2)$ between the means, the estimated difference 
is $y = (\bar{x}_1 - \bar{x}_2)$. The variance of the estimate is $\sigma_y^2 = (\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2)$, where $\lambda_1 = 1/n_1$ and 
$\lambda_2 = 1/n_2$. The estimated values of $\sigma_1^2$ and $\sigma_2^2$ are $s_1^2 = \Sigma_1/f_1$ and $s_2^2 = \Sigma_2/f_2$, where $\Sigma_1$ and $\Sigma_2$ 
are the respective sums of squares of observations from the individual sample means and 
$f_1 = (n_1 - 1)$ and $f_2 = (n_2 - 1)$. These $s^2$ are distributed in the form (1) and the postulated 
conditions of independence hold. 
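
As a concrete illustration of this two-sample set-up, the Python sketch below collects the quantities the rest of the paper works with. The function name and the observations are invented for the example; only the definitions of $y$, $\lambda_i$, $s_i^2$ and $f_i$ come from the paragraph above.

```python
from statistics import mean, variance

def two_sample_setup(sample_1, sample_2):
    """Return y, the lambda_i, the s_i^2 and the f_i for two independent samples."""
    n1, n2 = len(sample_1), len(sample_2)
    y = mean(sample_1) - mean(sample_2)            # estimate of eta
    lambdas = (1.0 / n1, 1.0 / n2)                 # known multipliers 1/n_i
    s2 = (variance(sample_1), variance(sample_2))  # sums of squares / (n_i - 1)
    f = (n1 - 1, n2 - 1)                           # degrees of freedom
    return y, lambdas, s2, f

# Made-up observations, purely for illustration.
y, lambdas, s2, f = two_sample_setup([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0])
estimated_variance_of_y = sum(l * v for l, v in zip(lambdas, s2))
print(y, f, estimated_variance_of_y)
```

`statistics.variance` already divides by $n - 1$, so it matches the paper's $s_i^2 = \Sigma_i/f_i$ directly.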

Another particular case, again with k = 2, arises when we wish to compare two regression 
coefficients, fitted to independent sets of data, without making the assumption that the 
population residual variance about the true regression line is the same for both sets. 

The present paper is written mainly with these practical applications of the case $k = 2$ 
in mind, but the results are expressed generally for any $k$, since no further analytical difficulties 
are involved. It will be shown how probability statements about $y$, considered as 
an estimate of $\eta$, may be made similar in character to those which W. S. Gosset derived for 
the mean of a single sample of $n$ observations (‘Student’, 1908). We shall, in effect, seek a 
quantity $h$, calculable from the observations, with the property that the chance of the 
difference $(y - \eta)$ falling short of $h$ is a given probability $P$. It is clear that $h$ must be a 
function of the individual variances $s_i^2$ and of $P$. If the abbreviation Pr. is used to mean 
‘the probability of the relation in the bracket following’, our problem is to satisfy the 
equation 

$$\mathrm{Pr.}[(y - \eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)] = P. \tag{2}$$

In Gosset's case the solution was, of course, simply 

$$\mathrm{Pr.}[(\bar{x} - \alpha) < s\,t_P/\sqrt{n}] = P, \tag{3}$$

where $t_P$ is the value, corresponding to the probability level $P$, in the ‘Student’ $t$-distribution 
with $f = (n - 1)$ degrees of freedom. 





In the next section the mathematical derivation of the exact solution of (2) is given. This 
is then followed by some consideration of its expression in numerical terms. First, a series 
solution in powers of $1/f_i$ is developed, which may be used for calculating tables. Then some 
comparisons are made with a non-series approximate solution which is based on a particular 
way of regarding the distribution of a quantity of the general form $z = \sum a_i\chi_i^2$. 

Some brief discussion is then added which may serve to place the present contribution 
in its proper relationship to other papers which have been written on this topic. 

Finally, it is shown how the inequality (2) may be adapted to provide an interval estimate 
for $\eta$. 


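2. Mathematical derivation of solution. Let $J(s_1^2, s_2^2, \ldots, s_k^2, P)$ denote the probability that 
$(y - \eta)$ is less than $h(s_1^2, s_2^2, \ldots, s_k^2, P)$, given $s_i^2$ $(i = 1, 2, \ldots, k)$. Then, since $y$ is distributed quite 
independently of the estimated variances, we have 

$$J(s_1^2, s_2^2, \ldots, s_k^2, P) = \int_{-\infty}^{h/\sqrt{(\sum\lambda_i\sigma_i^2)}} \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac{1}{2}u^2}\,du = I\left\{\frac{h(s_1^2, s_2^2, \ldots, s_k^2, P)}{\sqrt{(\sum\lambda_i\sigma_i^2)}}\right\}, \tag{4}$$

where $I$ is used to denote the normal probability integral. The condition of equation (2) is 
then simply that, if $J(s_1^2, s_2^2, \ldots, s_k^2, P)$ is averaged over the probability distributions of $s_i^2$ as 
given by (1), the result will equal $P$. Thus 

$$\int\!\cdots\!\int J(s_1^2, s_2^2, \ldots, s_k^2, P)\prod_i p(s_i^2)\,ds_i^2 = P. \tag{5}$$

Now we may expand $J(s_1^2, s_2^2, \ldots, s_k^2, P)$ about an origin $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2)$ in a Taylor expansion. 
Thus 

$$J(s_1^2, s_2^2, \ldots, s_k^2, P) = \exp\left[\sum_i (s_i^2 - \sigma_i^2)\,\partial_i\right] J(w_1, w_2, \ldots, w_k, P), \tag{6}$$

it being understood that the exponential is to be expanded in a power series in $\partial_i$ and that 
$\partial_i$ is to be interpreted so that 

$$\partial_i^r J(w_1, w_2, \ldots, w_k, P) = \left[\frac{\partial^r J(w_1, w_2, \ldots, w_k, P)}{\partial w_i^r}\right]_{w_j = \sigma_j^2\ (j = 1, \ldots, k)}. \tag{7}$$

On making the substitution of (6) into (5) our result may be written 

$$\Theta\,J(w_1, w_2, \ldots, w_k, P) = P, \tag{8}$$

where 

$$\Theta = \prod_i \int \exp\left[(s_i^2 - \sigma_i^2)\,\partial_i\right] p(s_i^2)\,ds_i^2. \tag{9}$$

Now, substituting into (9) from (1), the integral comes out in simple form, i.e. 

$$\Theta = \prod_i \left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)^{-\frac{1}{2}f_i}\exp\left[-\sigma_i^2\partial_i\right]
= \exp\left\{-\sum_i \sigma_i^2\partial_i - \tfrac{1}{2}\sum_i f_i\log\left(1 - \frac{2\sigma_i^2\partial_i}{f_i}\right)\right\}$$
$$= \exp\left\{\sum_i \frac{\sigma_i^4\partial_i^2}{f_i} + \frac{4}{3}\sum_i \frac{\sigma_i^6\partial_i^3}{f_i^2} + 2\sum_i \frac{\sigma_i^8\partial_i^4}{f_i^3} + \text{etc.}\right\}. \tag{10}$$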
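
The series in (10) can be checked numerically for a single variance by putting an ordinary small number $t$ in place of the operator $\partial_i$. The Python sketch below makes that substitution, which is an illustrative device and not part of the paper's argument; the function names are invented for the example.

```python
import math

def log_theta_exact(t, sigma2, f):
    """log E[exp((s^2 - sigma^2) t)] when f * s^2 / sigma^2 is chi-square on f d.f."""
    x = 2.0 * sigma2 * t / f
    if x >= 1.0:
        raise ValueError("the expectation exists only for 2 * sigma2 * t / f < 1")
    return -sigma2 * t - 0.5 * f * math.log(1.0 - x)

def log_theta_series(t, sigma2, f):
    """First three terms of the exponent in (10), for one variance."""
    return (sigma2 ** 2 * t ** 2 / f
            + 4.0 * sigma2 ** 3 * t ** 3 / (3.0 * f ** 2)
            + 2.0 * sigma2 ** 4 * t ** 4 / f ** 3)

exact = log_theta_exact(0.1, 1.0, 10)
approx = log_theta_series(0.1, 1.0, 10)
print(exact, approx)
```

The two values differ only by the first omitted term of the logarithmic series, which is of order $t^5/f^4$.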




Substituting (4) into (8) we have finally 

$$\Theta\,I\left\{\frac{h(w_1, w_2, \ldots, w_k, P)}{\sqrt{(\sum\lambda_i\sigma_i^2)}}\right\} = P. \tag{11}$$

This, in a very condensed form, is the solution to our problem.* The operator $\Theta$ constitutes 
a direction to carry out the partial differentiations indicated by (10). $w_i$ must then be equated 
to $\sigma_i^2$. The solution of the resulting equation will give $h(\sigma_1^2, \sigma_2^2, \ldots, \sigma_k^2, P)$ and therefore the 
required $h(s_1^2, s_2^2, \ldots, s_k^2, P)$. 

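3. The development of the series solution. It will be convenient to write $h(w)$ for 
$h(w_1, w_2, \ldots, w_k, P)$ and $\xi$ for the normal deviate such that $I(\xi) = P$. We may then expand 
$I\{h(w)/\sqrt{(\sum\lambda_i\sigma_i^2)}\}$ in a Taylor series about $\xi$ as origin. Thus 

$$I\left\{\frac{h(w)}{\sqrt{(\sum\lambda_i\sigma_i^2)}}\right\} = \exp\left[\left\{\frac{h(w)}{\sqrt{(\sum\lambda_i\sigma_i^2)}} - \xi\right\}D\right] I(v), \tag{12}$$

it being understood that the exponential is to be expanded in powers of $D$, and that these 
powers are to be interpreted so that 

$$D^r I(v) = \left[\frac{d^r I(v)}{dv^r}\right]_{v = \xi}. \tag{13}$$

Equation (11) then becomes 

$$\Theta\exp\left[\left\{\frac{h(w)}{\sqrt{(\sum\lambda_i\sigma_i^2)}} - \xi\right\}D\right] I(v) = P. \tag{14}$$

This may now be solved by successive approximations. 

The initial approximation is the large-sample normal approximation 

$$h_0(w) = \xi\sqrt{(\textstyle\sum\lambda_i w_i)}, \tag{15}$$

and we may write 

$$h(w) = h_0(w) + h_1(w) + h_2(w) + \text{etc.}, \tag{16}$$

where $h_1(w)$ includes terms of order $1/f_i$, $h_2(w)$ terms of order $1/f_i^2$ and so on. For the moment 
we shall treat terms of the order $1/f_i^2$ as negligible. Then (14) gives 

[Equation (17): not legible in the source.] 

Or, using (10), 

[Equations (18) and (19): not legible in the source.] 

The equation of the first order term to zero gives 

$$h_1(\sigma^2) = \xi\sqrt{(\textstyle\sum\lambda_i\sigma_i^2)}\;\frac{(1 + \xi^2)}{4}\;\frac{\sum\lambda_i^2\sigma_i^4/f_i}{(\sum\lambda_i\sigma_i^2)^2}. \tag{20}$$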


* Equation (11) can also be expressed as an integral equation and this form may be necessary for 
providing numerical values where the $f_i$ are very small. 





This can then be substituted in the second-order term which, when equated to zero, will 
give $h_2(\sigma^2)$. The process may obviously be extended to higher orders, although the expressions 
become so complex that a slightly different procedure has then been found to be preferable. 
To terms of order $1/f_i^2$ our solution is 

$$h(s^2) = \xi\sqrt{(\textstyle\sum\lambda_i s_i^2)}\left[1 + \frac{(1+\xi^2)}{4}\frac{\sum\lambda_i^2 s_i^4/f_i}{(\sum\lambda_i s_i^2)^2} - \frac{(1+\xi^2)}{2}\frac{\sum\lambda_i^2 s_i^4/f_i^2}{(\sum\lambda_i s_i^2)^2}\right.$$
$$\left. + \frac{(3+5\xi^2+\xi^4)}{3}\frac{\sum\lambda_i^3 s_i^6/f_i^2}{(\sum\lambda_i s_i^2)^3} - \frac{(15+32\xi^2+9\xi^4)}{32}\frac{(\sum\lambda_i^2 s_i^4/f_i)^2}{(\sum\lambda_i s_i^2)^4}\right]. \tag{21}$$

It may be noted that in the particular case $k = 1$, this reduces, as it should, to the already 
known expansion of the deviate of the straightforward ‘Student’ distribution (Fisher, 
1941, p. 161), viz. 

$$\xi\left[1 + \frac{(1+\xi^2)}{4f} + \frac{(3+16\xi^2+5\xi^4)}{96f^2} + \ldots\right]. \tag{22}$$

It is proposed in another communication to give tables of $h(s^2)$ based on the expansion 
(21) carried to some further terms. 
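
The expansion (21) is easy to evaluate directly, and the statement that it reduces to (22) when $k = 1$ gives a convenient check. The Python sketch below is an illustration under stated assumptions: the function names are invented, and the coefficients are taken as $(1+\xi^2)/4$, $(1+\xi^2)/2$, $(3+5\xi^2+\xi^4)/3$ and $(15+32\xi^2+9\xi^4)/32$, the constant 15 being the value for which the $k = 1$ case agrees with (22).

```python
import math
from statistics import NormalDist

def welch_series_h(P, lambdas, s2, f):
    """h(s^2) from the series (21), to terms of order 1/f_i^2."""
    xi = NormalDist().inv_cdf(P)                  # normal deviate with I(xi) = P
    c = [l * v for l, v in zip(lambdas, s2)]      # c_i = lambda_i * s_i^2
    total = sum(c)
    a = sum(ci ** 2 / fi for ci, fi in zip(c, f)) / total ** 2
    b = sum(ci ** 2 / fi ** 2 for ci, fi in zip(c, f)) / total ** 2
    d = sum(ci ** 3 / fi ** 2 for ci, fi in zip(c, f)) / total ** 3
    bracket = (1.0
               + (1 + xi ** 2) / 4 * a
               - (1 + xi ** 2) / 2 * b
               + (3 + 5 * xi ** 2 + xi ** 4) / 3 * d
               - (15 + 32 * xi ** 2 + 9 * xi ** 4) / 32 * a ** 2)
    return xi * math.sqrt(total) * bracket

def student_series_t(P, f):
    """'Student' deviate from the expansion (22), to terms of order 1/f^2."""
    xi = NormalDist().inv_cdf(P)
    return xi * (1 + (1 + xi ** 2) / (4 * f)
                 + (3 + 16 * xi ** 2 + 5 * xi ** 4) / (96 * f ** 2))

# With k = 1, lambda = 1 and s^2 = 1 the two expansions should coincide.
print(welch_series_h(0.95, [1.0], [1.0], [20]), student_series_t(0.95, 20))
```

For $f = 20$ and $P = 0.95$ both functions give a value within about 0.0002 of the tabled ‘Student’ deviate 1.7247, which shows how quickly the series settles for moderate degrees of freedom.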


4. Discussion of a non-series approximation. It will be recalled that in Gosset's original 
approach to the single sample problem (‘Student’, 1908) his initial step was to note that the 
first four moments of the distribution of $s^2$ were consistent with the assumption that the 
distribution could be represented by a Pearson Type III curve. He was fortunate in this 
way to rediscover a distribution which had already been found by Helmert, as this permitted 
him to go on to the derivation of the $t$-distribution. In our present case, as in many others 
arising naturally in statistical work, we are led to consider, instead of $s^2$, a linear function 
$\sum\lambda_i s_i^2$ of several $s_i^2$. If this linear function were distributed in a Pearson Type III distribution 
a whole range of new problems could be dealt with by well-established theory. However, in 
general, we do not have this good fortune. For $\sum\lambda_i s_i^2$ is of the form $\sum a_i\chi_i^2$, where $a_i = \lambda_i\sigma_i^2/f_i$, 
and the distribution of this quantity is only of Type III if all the $a_i$, except one, are zero, or 
if all the $a_i$ happen to be equal. 

Nevertheless, for practical purposes an approximation to the distribution of $\sum\lambda_i s_i^2$ using 
a Type III curve with start, mean and variance suitably adjusted, can still be useful. In two 
previous papers (Welch, 1936, 1938) I have employed this method to obtain numerical comparisons 
of the merits of different statistical procedures, where full calculations with the 
true distributions would have been unduly laborious. The method of determining the constants 
in the approximation was given for the case $k = 2$ in the first of these papers and is 
as follows. 

If $z = a s_1^2 + b s_2^2$ and the approximate distribution curve is written in the form 

$$p(z)\,dz = \frac{1}{\Gamma(\tfrac{1}{2}f)}\left(\frac{z}{2g}\right)^{\frac{1}{2}f-1}\exp\left(-\frac{z}{2g}\right)d\left(\frac{z}{2g}\right), \tag{23}$$

then making the first two moments of (23) agree with the true moments of $z$, we find 

$$f = \frac{(a\sigma_1^2 + b\sigma_2^2)^2}{a^2\sigma_1^4/f_1 + b^2\sigma_2^4/f_2}, \qquad g = \frac{a^2\sigma_1^4/f_1 + b^2\sigma_2^4/f_2}{a\sigma_1^2 + b\sigma_2^2}. \tag{24}$$


Phrasing the matter rather differently, we can say that $z/g$ is approximately distributed as 
$\chi^2$ with degrees of freedom $f$. Of course $f$, given by (24), will in general be fractional, but the 
letter used to designate this quantity was chosen, and the term ‘effective degrees of freedom’ 
has been used, because by doing so we can appeal immediately to a considerable body of 
further theoretical results. 
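
The moment matching that leads to the constants $f$ and $g$ can be written out as a small Python sketch. It assumes that $z = a s_1^2 + b s_2^2$ has mean $a\sigma_1^2 + b\sigma_2^2$ and variance $2(a^2\sigma_1^4/f_1 + b^2\sigma_2^4/f_2)$, and that $z/g$ is treated as $\chi^2$ on $f$ degrees of freedom, so that the mean is $gf$ and the variance $2g^2f$. The function name and the numerical inputs are invented for the example.

```python
def type3_constants(a, b, sigma1_sq, sigma2_sq, f1, f2):
    """Effective degrees of freedom f and scale g for z = a*s1^2 + b*s2^2."""
    mean_z = a * sigma1_sq + b * sigma2_sq
    # Half the variance of z: each s_i^2 has variance 2 * sigma_i^4 / f_i.
    half_var_z = a ** 2 * sigma1_sq ** 2 / f1 + b ** 2 * sigma2_sq ** 2 / f2
    f = mean_z ** 2 / half_var_z
    g = half_var_z / mean_z
    return f, g

# Hypothetical inputs: unequal variances and unequal degrees of freedom.
f_eff, g = type3_constants(0.1, 0.05, 4.0, 9.0, 9, 19)
print(f_eff, g)

# With a*sigma1^2/f1 equal to b*sigma2^2/f2 the fit is exact and f = f1 + f2.
print(type3_constants(1.0, 1.0, 1.0, 1.0, 5, 5))
```

The effective degrees of freedom always lie between the smaller of $f_1$, $f_2$ and their sum, which is one reason a fractional value causes no practical difficulty.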

In particular we can say that the criterion 

$$v = \frac{(y - \eta)}{\sqrt{(\lambda_1 s_1^2 + \lambda_2 s_2^2)}} \tag{25}$$

follows approximately the ‘Student’ $t$-distribution with degrees of freedom 

$$f = \frac{(\lambda_1\sigma_1^2 + \lambda_2\sigma_2^2)^2}{\lambda_1^2\sigma_1^4/f_1 + \lambda_2^2\sigma_2^4/f_2}. \tag{26}$$

More generally, when $k$ is not restricted to 2, the same line of argument leads us to say 
that the criterion 

$$v = \frac{(y - \eta)}{\sqrt{(\sum\lambda_i s_i^2)}} \tag{27}$$

is approximately distributed as ‘Student’s’ $t$ with degrees of freedom 

$$f = \frac{(\sum\lambda_i\sigma_i^2)^2}{\sum\lambda_i^2\sigma_i^4/f_i}. \tag{28}$$

Not knowing the $\sigma$'s in (28), there are several ways in which we may now proceed, depending 
on what weight we may be willing to attach to any vague a priori notions we may possess 
of their relative magnitudes (cf. Welch, 1938). If we are not willing to assume anything, 
perhaps the best choice is 

$$f = \frac{(\sum\lambda_i s_i^2)^2 - 2\sum\lambda_i^2 s_i^4/(f_i + 2)}{\sum\lambda_i^2 s_i^4/(f_i + 2)}. \tag{29}$$

It may be shown that the numerator of (29) has, in repeated samples, an average value 
$(\sum\lambda_i\sigma_i^2)^2$, and the denominator has average value $\sum\lambda_i^2\sigma_i^4/f_i$. In a certain sense, therefore, 
(29) is a fair estimate of (28). 

To sum up, then, the interpretation of $y$ as an estimate of $\eta$, using the present type of 
approximation, involves only the reference of the criterion (27) to tables of the ‘Student’ 
distribution, entered with degrees of freedom given by (29). 
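
The procedure just summarized amounts to two small calculations, sketched here in Python. The function names and the numerical inputs are invented for the example. The estimate of the degrees of freedom is taken in the form $(\sum\lambda_i s_i^2)^2/\sum\{\lambda_i^2 s_i^4/(f_i+2)\} - 2$; this reading of (29) is an assumption, chosen because it is the form whose numerator and denominator have the average values stated in the text.

```python
import math

def welch_criterion(y, eta, lambdas, s2):
    """Criterion (27): (y - eta) / sqrt(sum(lambda_i * s_i^2))."""
    return (y - eta) / math.sqrt(sum(l * v for l, v in zip(lambdas, s2)))

def estimated_df(lambdas, s2, f):
    """Estimated degrees of freedom, reading (29) as
    (sum c_i)^2 / sum(c_i^2 / (f_i + 2)) - 2 with c_i = lambda_i * s_i^2.
    This reading is an assumption (see the text above the code)."""
    c = [l * v for l, v in zip(lambdas, s2)]
    return sum(c) ** 2 / sum(ci ** 2 / (fi + 2) for ci, fi in zip(c, f)) - 2.0

# Hypothetical two-sample figures: n1 = 10 and n2 = 20 observations.
lambdas, s2, f = [1 / 10, 1 / 20], [4.0, 9.0], [9, 19]
v = welch_criterion(1.7, 0.0, lambdas, s2)
df = estimated_df(lambdas, s2, f)
print(round(v, 3), round(df, 1))
```

The value `v` would then be referred to a table of the ‘Student’ distribution entered with `df` degrees of freedom, interpolating where `df` is fractional. With a single variance the estimate returns exactly the original degrees of freedom.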

Some further light is now thrown on this procedure by the expansion for the exact solution 
of our problem derived in the preceding section. For the implications of referring $v$ to the 
‘Student’ distribution may be seen by substituting $f$ from (29) into the expansion (22) of 
the ‘Student’ deviate. On doing this and then expanding in powers of $1/f_i$ it is found that, 
in effect, our approximation corresponds to assuming that 

$$h(s^2) = \xi\sqrt{(\textstyle\sum\lambda_i s_i^2)}\left[1 + \frac{(1+\xi^2)}{4}\frac{\sum\lambda_i^2 s_i^4/f_i}{(\sum\lambda_i s_i^2)^2} - \frac{(1+\xi^2)}{2}\frac{\sum\lambda_i^2 s_i^4/f_i^2}{(\sum\lambda_i s_i^2)^2} + \frac{(51+64\xi^2+5\xi^4)}{96}\frac{(\sum\lambda_i^2 s_i^4/f_i)^2}{(\sum\lambda_i s_i^2)^4}\right], \tag{30}$$



whereas, in fact, the true solution is given by (21). Comparison shows that we have exact 





agreement to terms of order $1/f_i$ and in the first of the quadratic terms. To the second order 
the difference between the expressions in square brackets in equations (21) and (30) is 

$$\frac{(3+5\xi^2+\xi^4)}{3}\left[\frac{\sum\lambda_i^3 s_i^6/f_i^2}{(\sum\lambda_i s_i^2)^3} - \frac{(\sum\lambda_i^2 s_i^4/f_i)^2}{(\sum\lambda_i s_i^2)^4}\right]. \tag{31}$$



This difference vanishes if any one of the $\lambda_i s_i^2$ is overwhelmingly larger than all the others, or 
if $s_i^2$ is proportional to $f_i/\lambda_i$. It appears that, in general, the difference is not likely to be large. 
We have, therefore, found some justification for using the Type III approximation in the 
present case. 
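
The two conditions under which the second-order difference vanishes can be checked numerically. The Python sketch below assumes that the difference has the form $\tfrac{1}{3}(3+5\xi^2+\xi^4)\{\sum(\lambda_i^3 s_i^6/f_i^2)/(\sum\lambda_i s_i^2)^3 - (\sum\lambda_i^2 s_i^4/f_i)^2/(\sum\lambda_i s_i^2)^4\}$. That expression is a reconstruction obtained by expanding the approximation in powers of $1/f_i$ and subtracting it from (21); it is not a transcription of the printed formula, and the function name and inputs are invented.

```python
from statistics import NormalDist

def second_order_gap(P, lambdas, s2, f):
    """Assumed second-order difference between the exact series and the
    Type III approximation (a reconstruction, see the text above the code)."""
    xi = NormalDist().inv_cdf(P)
    c = [l * v for l, v in zip(lambdas, s2)]   # c_i = lambda_i * s_i^2
    total = sum(c)
    cubic = sum(ci ** 3 / fi ** 2 for ci, fi in zip(c, f)) / total ** 3
    squared = (sum(ci ** 2 / fi for ci, fi in zip(c, f)) / total ** 2) ** 2
    return (3 + 5 * xi ** 2 + xi ** 4) / 3 * (cubic - squared)

f = [9, 19]
generic = second_order_gap(0.95, [0.1, 0.05], [4.0, 9.0], f)
proportional = second_order_gap(0.95, [1.0, 1.0], [0.3 * 9, 0.3 * 19], f)
dominant = second_order_gap(0.95, [1.0, 1.0], [1e6, 1.0], f)
print(generic, proportional, dominant)
```

In the generic case the gap is of the order of one part in a thousand of the leading term, while it is zero to rounding error when $\lambda_i s_i^2$ is proportional to $f_i$ and negligible when one component dominates.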


The above comparison has been made on the basis of the series developments, but it should 
be borne in mind that approximations based on positive frequency functions, such as those 
falling under the Pearson system, usually provide a higher degree of accuracy than might 
appear from any consideration of expansions. Furthermore, they are apt to give an insight 
into the nature of the situation which may sometimes be lost in working out the details of 
exact solutions. In the present case I feel that the comparison of this section serves to give 
added confidence in the exact solution,* which I have put forward in the previous two 
sections, quite as much as it demonstrates the value of the approximate method. 


5. Further discussion. In comparing the present contribution with other work on the
subject, the essential point to notice is the averaging process involved in equation (5). We
are not trying here to make probability statements valid for fixed $s_i^2$ but are averaging over
the joint probability distribution of the $s_i^2$, taking into account, therefore, the different
values which can arise by chance in sampling from populations with fixed $\sigma_i^2$.

This averaging over the joint distribution of the $s_i^2$ is parallel to the step taken in
Section III of Gosset's original memoir (1908) where, in effect, he starts with the distribution
of t for samples with fixed s and then averages over the distribution of s which he has already
derived earlier. He thus arrives at the unrestricted distribution of t (or, more strictly, of a
quantity z, which is equal to t multiplied by a constant). This distribution forms the basis
of the significance tests which he illustrates in his Section IX and of the method of deriving
interval estimates for the population mean which he outlines in his Section VIII.

In the present paper the parallelism with Gosset's work may be obscured to some extent
by the fact that we do not from the outset seek the probability distribution of some pivotal
quantity like t, explicitly expressed. It so happens that we are able to proceed to a method
of deriving an expansion for the required probability level without making explicit reference
to such a quantity. Nevertheless there remains the important resemblance with Gosset's
development, in that we do not confine ourselves to samples with fixed $s_i^2$.

This procedure stands in sharp contrast to the formulation of the problem of comparing
two means, favoured by R. A. Fisher (e.g. 1941) and H. Jeffreys (1940). These writers
prefer a solution which they ascribe initially to W. U. Behrens (1929). Looked at from one
point of view, Behrens’s paper appears to contain some gross algebraical errors. Fisher and
Jeffreys, however, develop lines of argument by means of which they claim that Behrens’s
solution is quite justified. It seems to me difficult to say how far (if at all) any of these
arguments may have been in Behrens’s mind when he wrote his paper and I shall not
attempt to elucidate this question here. We may, however, permit ourselves one observation
about the developments according to Fisher and Jeffreys.

* Exact in the sense that it is independent of the irrelevant population parameters $\sigma_i^2$.


Both these writers, at some stage, limit the field of their probability inferences to a subset
in which the $s_i^2$ are regarded as fixed. In order to solve the problem on these lines Jeffreys
introduces an a priori distribution function for the unknown $\sigma_i$, following his general
philosophy for dealing with such questions. Fisher, on the other hand, arrives at the same
answer by a special utilization of what he terms the fiducial distribution of $\sigma_i$.

Jeffreys’s approach here does not raise any new issues to those who are familiar with the
general body of his researches on statistical inference. Fisher’s justification of Behrens’s
solution is perhaps of more immediate interest as it raises controversial points which are
important more specifically in relation to our present topic of discussion. For although
Fisher’s approach has been very much criticized by a number of writers, starting with
M. S. Bartlett (1936), the critics have not wished to throw doubt on the whole body of
results which Fisher includes under the heading of fiducial inference. The criticism has been
for the most part selective, directed mainly at the way in which so-called simultaneous
fiducial distributions of several parameters have been defined and manipulated.

I have, myself, quite definite views on these questions (particularly on the usage of the
word ‘fiducial’) but do not feel that I need express them at any great length here. I disagree
with Fisher, but this divergence of opinion must already have become apparent in
the way I have defined the field within which I make my probability inferences about $\eta$.
It appears to me to be quite artificial to restrict our view to one which, even in a limited
sense, fixes $s_i^2$. It is true that, in the two-sample problem, we have to draw our inferences
from the unique pair of samples observed, or, more precisely, from the statistics $\bar{x}_1$, $\bar{x}_2$, $s_1^2$
and $s_2^2$ which they provide. These statistics are our only data for the purpose of making
inferences, but we add something to these data in the interpretation when we regard the
samples as being drawn randomly from hypothetical normal populations. Once having
embarked on this method of interpretation, we should stick to it consistently throughout.
The sampling variations of $s_i^2$ should be taken into account only by a direct use of the
probability distributions as given by our equation (1) and not by any inversion such as is
involved in Fisher’s conception of the fiducial distribution of $\sigma_i^2$. As we have seen, it is
quite possible to make probability statements about the difference between the population
means without making any reference whatever either to inverse probability or to fiducial
distributions.

The distinction between the procedure which Fisher advocates and one which averages
over the $s^2$ distributions has, of course, been stressed by most of the writers who have
contributed papers on the subject, from whatever viewpoint (e.g. Bartlett, 1936, p. 566,
and Yates, 1939). What has been lacking hitherto, however, is a solution, analogous to
Gosset’s single sample solution, which makes complete use of the information contained
in the data provided. Bartlett indicated one particular way in which probability inferences
about the difference between two population means might be made, but was careful to point
out that the problem of making the best possible inferences (in the theoretical sense of
utilizing all the information in the data to its full extent) was still an open one. There has
indeed been some doubt expressed whether a fully satisfactory solution from this point of
view existed at all. I believe, however, that the one I advance above in equation (11),
and develop in equation (21), meets all the requirements that one can reasonably expect.

Whatever conclusion the reader may come to on these matters, however, he will probably
wish to know how, in the numerical details, this solution will differ from that of Behrens.
This will be more easily seen when some tables become available, but fortunately certain
comparisons can already be made. For Fisher (1941, p. 155) has provided a series expansion
of the Behrens solution. In our notation, and with k = 2, this may be written, to
order $1/f_i$, as follows:

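$$h(s^2) = \xi\sqrt{\lambda_1 s_1^2 + \lambda_2 s_2^2}\,\left[1 + \frac{(1+\xi^2)}{4}\,\frac{\lambda_1^2 s_1^4/f_1 + \lambda_2^2 s_2^4/f_2}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2} + \left(\frac{1}{f_1} + \frac{1}{f_2}\right)\frac{\lambda_1 \lambda_2 s_1^2 s_2^2}{(\lambda_1 s_1^2 + \lambda_2 s_2^2)^2}\right]. \tag{32}$$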


Even to this order, this differs from our equation (21) in the inclusion of an extra term. In
other words, although the two solutions are the same when samples are large enough to
adopt the large-sample normal approximation, they differ immediately we take into account
the first corrective term, i.e. they differ as soon as we begin to attach any importance to
‘Studentization’.
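The size of the extra term can be gauged numerically. The sketch below is only an illustration: it assumes that the bracket of (21), to order $1/f_i$, has the form which appears in equation (33), takes the Behrens bracket from (32), and uses hypothetical sample figures with $\lambda_i = 1/n_i$. The function names are not from the paper.

```python
def welch_factor(xi, lam, s2, f):
    """Bracket of (21) to order 1/f_i (assumed form, as in equation (33))."""
    total = sum(l * v for l, v in zip(lam, s2))
    quad = sum((l * v) ** 2 / fi for l, v, fi in zip(lam, s2, f))
    return 1.0 + (1.0 + xi ** 2) / 4.0 * quad / total ** 2


def behrens_factor(xi, lam, s2, f):
    """Bracket of (32), two samples only: the Welch bracket plus the extra term."""
    a, b = lam[0] * s2[0], lam[1] * s2[1]
    extra = (1.0 / f[0] + 1.0 / f[1]) * a * b / (a + b) ** 2
    return welch_factor(xi, lam, s2, f) + extra


# Hypothetical figures: samples of 10 and 15 with observed variances 4.0 and 9.0.
xi = 1.6449
lam, s2, f = (1 / 10, 1 / 15), (4.0, 9.0), (9, 14)
print(welch_factor(xi, lam, s2, f), behrens_factor(xi, lam, s2, f))
```

The extra term is never negative, and it disappears when one of the $\lambda_i s_i^2$ is zero, in which case both brackets reduce to the single-sample ‘Student’ correction.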


6. An interval estimate for $\eta$. We have shown in §§2 and 3 how to calculate a value
$h(s_1^2, s_2^2, \ldots, s_k^2, P)$, depending on the observed variances $s_1^2, \ldots, s_k^2$, such that the probability
is P that $(y - \eta) < h(s_1^2, s_2^2, \ldots, s_k^2, P)$. This provides a method of testing the consistency of an
observed y with a prescribed value $\eta$.

When the question is not whether any particular given $\eta$ is contradicted by the data, but
rather one of estimating $\eta$ and at the same time of providing a measure of the uncertainty
of the estimate, the further step required is immediate. For, as in the case of a single sample,
the order of the words in our probability statement can be changed so that it becomes: the
probability is P that $\eta$ is greater than $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P)\}$. An interval estimate for $\eta$ is then
obtained by taking two levels $P_1$ and $P_2$ for P. Thus the probability is $(P_1 - P_2)$ that $\eta$ lies
between $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_1)\}$ and $\{y - h(s_1^2, s_2^2, \ldots, s_k^2, P_2)\}$.

If $P_2 = (1 - P_1)$ the range will be symmetrically placed about y. Thus, for example, if
$P_1 = 0.95$ and $P_2 = 0.05$, the chance will be 90% that $\eta$ lies within the range

$$y \pm 1.6449\sqrt{\textstyle\sum \lambda_i s_i^2}\,\left[1 + \frac{1 + (1.6449)^2}{4}\,\frac{\sum \lambda_i^2 s_i^4/f_i}{\left(\sum \lambda_i s_i^2\right)^2}\right]. \tag{33}$$


It may be noted, incidentally, that this range is always narrower than similar ranges 
calculated from Behrens’s solution. 
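A range of the kind just described might be computed as in the following sketch. It assumes that h is taken only to the order shown in (33) and that $\lambda_i = 1/n_i$; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def welch_interval(y, lam, s2, f, p1=0.95):
    """Symmetrical range y -/+ h, with h taken to order 1/f_i only."""
    xi = NormalDist().inv_cdf(p1)            # 1.6449 when p1 = 0.95
    total = sum(l * v for l, v in zip(lam, s2))
    quad = sum((l * v) ** 2 / fi for l, v, fi in zip(lam, s2, f))
    h = xi * sqrt(total) * (1.0 + (1.0 + xi ** 2) / 4.0 * quad / total ** 2)
    return y - h, y + h


# Hypothetical data: observed difference y = 1.8 from samples of 10 and 15.
low, high = welch_interval(1.8, (1 / 10, 1 / 15), (4.0, 9.0), (9, 14))
print(round(low, 3), round(high, 3))
```

With $P_1 = 0.95$ the two limits enclose $\eta$ with a chance of about 90%, subject to the neglect of the higher terms of the expansion.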


REFERENCES 

Bartlett, M. S. (1936). The information available in small samples. Proc. Camb. Phil. Soc. 32, 560-6.

Behrens, W. U. (1929). Ein Beitrag zur Fehlerberechnung bei wenigen Beobachtungen. Landw. Jb.
68, 807-37.

Fisher, R. A. (1941). The asymptotic approach to Behrens's integral, with further tables for the d test
of significance. Ann. Eugen., Lond., 11, 141-72.

Jeffreys, H. (1940). Note on the Behrens-Fisher formula. Ann. Eugen., Lond., 10, 48-51.

‘Student’ (1908). The probable error of a mean. Biometrika, 6, 1-25.

Welch, B. L. (1936). Specification of rules for rejecting too variable a product, with particular reference
to an electric lamp problem. Roy. Statist. Soc. Suppl. 3, 29-48.

Welch, B. L. (1938). The significance of the difference between two means when the population
variances are unequal. Biometrika, 29, 350-62.

Yates, F. (1939). An apparent inconsistency arising from tests of significance based on fiducial distributions
of unknown parameters. Proc. Camb. Phil. Soc. 35, 579-91.




THE DISTRIBUTION OF KENDALL'S τ COEFFICIENT OF RANK
CORRELATION IN RANKINGS CONTAINING TIES

By G. P. SILLITTO, Ph.D., B.Sc., Research Department, I.C.I. (Explosives) Ltd.

A new coefficient of rank correlation has been described by Kendall (1938, 1942, 1943) and
denoted by him as $\tau$. This coefficient has advantages over Spearman’s $\rho$ in respect of the
smoothness of its distribution and the rapidity with which it approaches normality, thus
facilitating significance testing, and in being readily adapted to cases of partial rank
correlation.

The distribution of $\tau$ has been worked out by Kendall (1938, 1943) for cases in which
neither ranking contains members which are graded equal, i.e. rankings containing no
‘ties’. It is the purpose of the present paper to deal with cases, which frequently arise in
practice, in which ties occur in one of the two rankings. The method is a generalization of
that of Kendall and will be given in some detail for the case of tied pairs, while the results
of further generalization to multiplet ties will be indicated without detailed proof, which can
in all cases be effected simply on the lines indicated.

Definition of τ for rankings containing ties
In counting the ‘score’ of a pair of rankings, by the methods suggested by Kendall, each
member is compared with the other members of the same ranking, and additions to or subtractions
from the score are made depending on whether it is smaller or greater in each case.
If some members are ranked equal then it is proposed that no change be made in the score
in comparing them. This obviously accords with the intuitive aspects of ranking. Thus in
the pair of rankings following, the score is +8:

1 2 3 4 5 6

2 1 3 5 6 3

The maximum score possible is thus obviously reduced by the presence of ties, and it is
evident that the presence of each tied pair reduces the maximum possible score by unity,
so that it becomes $\tfrac{1}{2}n(n-1) - p_2$ for the case of a ranking of n members containing $p_2$ pairs.
Thus for such a ranking $\tau$ would be defined as

$$\tau = 2S/\{n(n-1) - 2p_2\},$$

where S is the observed score.

Generally, each r-tuplet tie reduces the maximum possible score by $\tfrac{1}{2}r(r-1)$,* so that for
a ranking of n members containing $p_2$ pairs, $p_3$ triplets, ..., $p_r$ r-tuplets,

$$\tau = \frac{2S}{n(n-1) - 2p_2 - 6p_3 - \ldots - r(r-1)p_r}.$$
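The counting of the score and the definition of $\tau$ may be illustrated by the short sketch below, in which the ranking is supposed to be set against the natural order 1, 2, ..., n. The function names are illustrative only.

```python
from collections import Counter


def score(ranking):
    """Score S of `ranking` against the natural order; tied members add nothing."""
    n = len(ranking)
    return sum((ranking[j] > ranking[i]) - (ranking[j] < ranking[i])
               for i in range(n) for j in range(i + 1, n))


def tau(ranking):
    """2S / {n(n-1) - sum of r(r-1) over the r-tuplet ties}."""
    n = len(ranking)
    reduction = sum(r * (r - 1) for r in Counter(ranking).values())  # zero for untied members
    return 2 * score(ranking) / (n * (n - 1) - reduction)


print(score([2, 1, 3, 5, 6, 3]), tau([2, 1, 3, 5, 6, 3]))
```

For the pair of rankings given above the score is 8 and $\tau$ = 16/28.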

The sum of the frequencies of the possible scores 
When no ties are present, each permutation of the n members produces a possible score so 
that there are in all n! possible scores. When ties are present they decrease the number of 
possible permutations of an assigned set of members, but, on the other hand, they give rise 

to further families of scores due to the different places in the ranking which can be occupied
by the tied members. Thus, for instance, the rankings

113456, 122456, 123356, 123446, 123455

all give rise to the maximum score, 14.

* This result has been given by Kendall (1945).

Considering any assigned ranking, the number of possible permutations with $p_2$ pairs
present is $n!/2^{p_2}$, or if there are in addition $p_3$ triplets, ..., $p_r$ r-tuplets, $n!/\{(2!)^{p_2}(3!)^{p_3}\ldots(r!)^{p_r}\}$.

The distribution of scores of an assigned set of ranks will be referred to as the basic distribution
for the type of ranking concerned, since consideration of the possible ways of
assigning the $p_2$ pairs, $p_3$ triplets, etc., among the members of the ranking has only the effect
of multiplying the frequency of each score by a constant factor. This factor is the number of
ways of distributing the $p_1 + p_2 + p_3 + \ldots + p_r$ ranks among the n members. This is the
number of possible permutations

$$\frac{(p_1 + p_2 + p_3 + \ldots + p_r)!}{p_1!\,p_2!\,p_3!\ldots p_r!}.$$
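The two counts may be checked with a few lines of code. In this sketch the ties are passed as a dictionary {r: p_r} for r of 2 or more, and $p_1$ is taken to be the number of untied members; both conventions, like the function names, are assumptions made for the illustration.

```python
from math import factorial, prod


def permutations_of_assigned_set(n, ties):
    """n! / {(2!)^p2 (3!)^p3 ... (r!)^pr} for an assigned set of ranks."""
    return factorial(n) // prod(factorial(r) ** p for r, p in ties.items())


def ways_of_assigning(n, ties):
    """(p1 + p2 + ... + pr)! / (p1! p2! ... pr!), p1 being the untied members."""
    p1 = n - sum(r * p for r, p in ties.items())
    groups = [p1] + list(ties.values())
    return factorial(sum(groups)) // prod(factorial(g) for g in groups)


print(permutations_of_assigned_set(6, {2: 1}), ways_of_assigning(6, {2: 1}))
```

For six members with one tied pair the second count is 5, agreeing with the five rankings 113456, ..., 123455 listed earlier.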

Basic frequency distributions of the scores 

The basic frequency distributions can be established by an extension of the methods given
by Kendall. Considering first the case of tied pairs the frequency function of the basic
distribution of the scores may be written $f(S, n, p_2)$, where $p_2$ is the number of pairs. The
frequency generating function is then $\sum_S f(S, n, p_2)\,t^S$. Now consider the addition of another
tied pair, with a greater ranking than any of the existing ranks. If it is added to the extreme
left of the ranking it adds $-2n$ to the score. Moving one of the pair one place to the right
adds 2 to this new score; bringing the other added member up to it adds another 2. Starting
again with both the new members on the extreme left, movement of one of them two places
to the right adds 4 to the new score; bringing the other up to it, in two steps each of one place,
adds successively a further 2 and 4. Proceeding in this way all possible additions to the old
score which may be brought about by the addition of a tied pair of new members are represented
by the array

$$\begin{array}{llllll}
-2n & -(2n-2) & -(2n-4) & -(2n-6) & \ldots & 0 \\
 & -(2n-4) & -(2n-6) & -(2n-8) & \ldots & +2 \\
 & & -(2n-8) & -(2n-10) & \ldots & +4 \\
 & & & -(2n-12) & \ldots & \\
 & & & & \ddots & \\
 & & & & & +2n
\end{array}$$

Thus the addition of a new tied pair has the effect of multiplying the frequency generating
function by

$$\{t^{-2n} + (t^{-(2n-2)} + t^{-(2n-4)}) + (t^{-(2n-4)} + t^{-(2n-6)} + t^{-(2n-8)}) + \ldots + (t^{0} + t^{2} + \ldots + t^{2n})\}.$$

The addition of a single new member to the ranking has the effect of multiplying the
frequency generating function by

$$\{t^{-n} + t^{-(n-2)} + t^{-(n-4)} + \ldots + t^{n}\}$$

as shown by Kendall, the presence of tied pairs in the existing ranking having no effect.
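The two multipliers lend themselves to mechanical computation. The following sketch holds a distribution as a dictionary {score: frequency} and builds the basic distribution by adding first the untied members and then the tied pairs; this order of building up, and the function names, are choices made for the illustration.

```python
from collections import defaultdict


def multiply(dist, exponents):
    """Multiply the generating function `dist` by the sum of t^e over `exponents`."""
    out = defaultdict(int)
    for s, freq in dist.items():
        for e in exponents:
            out[s + e] += freq
    return dict(out)


def add_single(dist, n):
    """Add one untied member, ranked above the n members already present."""
    return multiply(dist, [2 * i - n for i in range(n + 1)])


def add_pair(dist, n):
    """Add a tied pair, ranked above the n members already present."""
    return multiply(dist, [2 * (i + j) - 2 * n
                           for i in range(n + 1) for j in range(i, n + 1)])


def basic_distribution(n, p2):
    """Basic distribution f(S, n, p2) as {S: frequency}, negative scores included."""
    dist, size = {0: 1}, 0
    for _ in range(n - 2 * p2):
        dist, size = add_single(dist, size), size + 1
    for _ in range(p2):
        dist, size = add_pair(dist, size), size + 2
    return dist


d = basic_distribution(5, 1)
print([d[s] for s in (1, 3, 5, 7, 9)])
```

The exponents in `add_pair` are the entries of the array above, the indices i and j denoting the numbers of existing members standing to the left of the two new members.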




With these two recurrence relations there is no difficulty in drawing up a table of basic
frequency distributions for tied pairs as exemplified in Table 1, in which only positive values
of the score are shown, negative values being obtainable by symmetry.


Table 1. Distribution of the score S for values of n from 3 to 7, and for rankings containing
$p_2$ pairs of members ranked equal (only positive half of symmetrical distribution)

                                              Values of S
  n  p_2 |   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21
  3   0  |   .   2   .   1
  3   1  |   1   .   1
  4   0  |   6   .   5   .   3   .   1
  4   1  |   .   3   .   2   .   1
  4   2  |   2   .   1   .   1
  5   0  |  22   .  20   .  15   .   9   .   4   .   1
  5   1  |   .  11   .   9   .   6   .   3   .   1
  5   2  |   6   .   5   .   4   .   2   .   1
  6   0  |   . 101   .  90   .  71   .  49   .  29   .  14   .   5   .   1
  6   1  |  52   .  49   .  41   .  30   .  19   .  10   .   4   .   1
  6   2  |   .  26   .  23   .  18   .  12   .   7   .   3   .   1
  6   3  |  14   .  12   .  11   .   7   .   5   .   2   .   1
  7   0  |   . 573   . 531   . 455   . 359   . 259   . 169   .  98   .  49   .  20   .   6   .   1
  7   1  | 292   . 281   . 250   . 205   . 154   . 105   .  64   .  34   .  15   .   5   .   1
  7   2  |   . 146   . 135   . 115   .  90   .  64   .  41   .  23   .  11   .   4   .   1
  7   3  |  74   .  72   .  63   .  52   .  38   .  26   .  15   .   8   .   3   .   1


Before the construction of the table has proceeded far, however, it becomes evident that
there is a recurrence relation between individual frequencies for any given value of n, such
that the frequency of any score $S_j$ for $p_2$ pairs is the sum of the frequencies of $S_j - 1$ and
$S_j + 1$ for $p_2 + 1$ pairs. This obviously arises from the fact that if two members ranked equal,
say rth, in a ranking with $p_2 + 1$ pairs are subsequently distinguished and given rankings r
and r + 1, this will increase the score by unity if the (r + 1)th member falls after the rth member
when the ranking is arrayed against another ranking in the natural order 1, 2, 3, ..., n, and
reduce it by unity if the other member of the pair becomes the (r + 1)th; and these two possibilities
complete the ways of forming a ranking with $p_2$ pairs from one with $p_2 + 1$ pairs.

This simple relationship, which may be written

$$f(S_j, n, p_2) = f(S_j + 1, n, p_2 + 1) + f(S_j - 1, n, p_2 + 1),$$

or taking another way of writing the basic distribution function

$$\phi(S_j, p_1, p_2) = \phi(S_j + 1, p_1 - 2, p_2 + 1) + \phi(S_j - 1, p_1 - 2, p_2 + 1), \tag{1}$$

$p_1$ being the number of members not in tied pairs, is of great assistance in tabulating the
frequency distribution, and will be used below to establish the formula for the variance of S.
It can be generalized to cover the effect of increasing the number of r-tuplets, when it becomes

$$\begin{aligned}
\phi(S_j, p_1, p_2, \ldots, p_{r-1}, p_r) ={}& \phi(S_j - r + 1,\ p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
&+ \phi(S_j - r + 3,\ p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1) \\
&+ \ldots \\
&+ \phi(S_j + r - 1,\ p_1 - 1, p_2, \ldots, p_{r-1} - 1, p_r + 1).
\end{aligned} \tag{2}$$
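Relations (1) and (2) can be verified for small n by direct enumeration. The sketch below counts the scores of every distinct arrangement of an assigned set of ranks; the function names and the particular sets of ranks chosen are illustrative.

```python
from collections import Counter
from itertools import permutations


def score(ranking):
    """Score against the natural order; tied members contribute nothing."""
    n = len(ranking)
    return sum((ranking[j] > ranking[i]) - (ranking[j] < ranking[i])
               for i in range(n) for j in range(i + 1, n))


def brute_force(ranks):
    """Basic distribution: frequency of each score over the distinct arrangements."""
    return Counter(score(p) for p in set(permutations(ranks)))


def check_recurrence(before, after, r):
    """True if f_before(S) is the sum of f_after(S + d), d = -(r-1), -(r-3), ..., r-1.

    `after` is `before` with one untied member merged into a neighbouring
    (r-1)-tuplet, so that an r-tuplet is formed.
    """
    f_before, f_after = brute_force(before), brute_force(after)
    limit = len(before) * (len(before) - 1) // 2 + r
    shifts = range(-(r - 1), r, 2)
    return all(f_before[s] == sum(f_after[s + d] for d in shifts)
               for s in range(-limit, limit + 1))


print(check_recurrence([1, 2, 3, 4, 5], [1, 2, 3, 4, 4], 2))
print(check_recurrence([1, 1, 3, 4, 5], [1, 1, 1, 4, 5], 3))
```

Enumeration of this kind soon becomes impracticable as n increases, which is why the recurrence relations are of value in tabulation.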





Frequency and probability distributions of the score S
From a table of basic frequency distributions such as Table 1, a table
showing the probability of attaining or exceeding an observed value of S by chance from an
uncorrelated pair of rankings can obviously be constructed, and Table 2 shows such probabilities
(positive S only, negative values obtainable by symmetry) for values of n up
to 10, and all possible numbers $p_2$ of paired and $p_3$ of triplet ties.
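For small n a probability of this kind can be obtained as an exact fraction by enumeration, as in the sketch below. It does not reproduce Table 2; the function names and the examples are illustrative only.

```python
from fractions import Fraction
from itertools import permutations


def score(ranking):
    """Score against the natural order; tied members contribute nothing."""
    n = len(ranking)
    return sum((ranking[j] > ranking[i]) - (ranking[j] < ranking[i])
               for i in range(n) for j in range(i + 1, n))


def tail_probability(ranks, s_observed):
    """Chance that S attains or exceeds `s_observed` when all distinct
    arrangements of `ranks` are equally likely."""
    arrangements = set(permutations(ranks))
    hits = sum(1 for p in arrangements if score(p) >= s_observed)
    return Fraction(hits, len(arrangements))


print(tail_probability([1, 2, 3, 4, 4], 7))   # n = 5 with one tied pair
print(tail_probability([1, 2, 3, 4, 5], 8))   # n = 5 without ties
```

In the first case the frequencies 3 and 1 for S = 7 and 9 are taken out of 60 arrangements, giving 1/15.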


The variance of S

The variance of S when ties are present can be readily derived by using the recurrence
relations given above and the value given by Kendall for the case of no ties. For the case
of tied pairs consider

$$\begin{aligned}
&(S+1)^2\phi(S+1, p_1-2, p_2+1) + (S-1)^2\phi(S-1, p_1-2, p_2+1) \\
&\quad = S^2\{\phi(S+1, p_1-2, p_2+1) + \phi(S-1, p_1-2, p_2+1)\} \\
&\qquad + 2\{(S+1)\phi(S+1, p_1-2, p_2+1) - (S-1)\phi(S-1, p_1-2, p_2+1)\} \\
&\qquad - \{\phi(S+1, p_1-2, p_2+1) + \phi(S-1, p_1-2, p_2+1)\}.
\end{aligned}$$

If now both sides of this equation are summed over all values of S, the terms on the left-hand
side become

$$\frac{2\,n!}{(2!)^{p_2+1}}\,\operatorname{var}\phi(S, p_1-2, p_2+1).$$

The first terms on the right-hand side become by virtue of the recurrence relation (1)

$$\frac{n!}{(2!)^{p_2}}\,\operatorname{var}\phi(S, p_1, p_2),$$

the second vanishes through the symmetry of the distribution, while the third becomes

$$-\frac{2\,n!}{(2!)^{p_2+1}}.$$

Hence there is obtained

$$\operatorname{var}\phi(S, p_1-2, p_2+1) = \operatorname{var}\phi(S, p_1, p_2) - 1,$$

and so

$$\operatorname{var}\phi(S, n-2p_2, p_2) = \operatorname{var}\phi(S, n) - p_2 = \frac{n(n-1)(2n+5)}{18} - p_2,$$

using Kendall’s result.

These results can also be generalized, using equation (2), to deal with multiplet ties,
obtaining

$$\operatorname{var}\phi(S, p_1-1, \ldots, p_{r-1}-1, p_r+1) = \operatorname{var}\phi(S, p_1, \ldots, p_{r-1}, p_r) - (r^2-1)/3,$$

and

$$\operatorname{var}\phi(S, p_1, p_2, \ldots, p_r) = \frac{n(n-1)(2n+5)}{18} - p_2 - \frac{3+8}{3}\,p_3 - \ldots - \frac{3+8+\ldots+(r^2-1)}{3}\,p_r.$$

It is obvious from these equations for multiplet ties that for any given number of ties of
each multiplicity the variance will tend towards that of the system without any ties as n
increases.
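The variance formula may be checked against the tabulated frequencies. In the sketch below the ties are given as a list of their multiplicities, and the two frequency rows are those of Table 1 for n = 5 with $p_2$ = 0 and $p_2$ = 1; the function names are illustrative.

```python
from fractions import Fraction


def variance_of_score(n, tie_sizes=()):
    """n(n-1)(2n+5)/18 less (3 + 8 + ... + (r^2 - 1))/3 for every r-tuplet."""
    var = Fraction(n * (n - 1) * (2 * n + 5), 18)
    for r in tie_sizes:
        var -= Fraction(sum(k * k - 1 for k in range(2, r + 1)), 3)
    return var


def variance_from_positive_half(half):
    """Variance of a symmetrical score distribution from its positive half {S: frequency}."""
    total = sum(freq * (1 if s == 0 else 2) for s, freq in half.items())
    return Fraction(sum(2 * s * s * freq for s, freq in half.items()), total)


print(variance_of_score(5, [2]),
      variance_from_positive_half({1: 11, 3: 9, 5: 6, 7: 3, 9: 1}))
```

Both routes give 47/3 for n = 5 with one tied pair, that is, Kendall's value 50/3 reduced by unity.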


Application to a practical case 

The following results were obtained in a practical case in which two different tests were
each carried out once on every member of a set of products. The problem is to determine the
degree of relationship between the results of the two tests. It is also an instance of an occurrence
which arises at times in practice, in which some of the results are ‘off the scale’ of measurement
with respect to one of the tests; these, twelve in all, have been given a tied ranking of 18.

Test A   40.80  41.70  36.75  37.55  29.40  25.20  26.75  28.45  26.85  26.35
Test B     1.5    1.5    1.5    2.5    3.5    >10    2.5    2.5    2.5    6.5

Test A   21.40  19.65  18.95  22.90  22.80  20.25  24.45  22.70  26.50      -
Test B     >10    >10    >10    >10    >10    >10    >10    >10    >10      -

Test A   22.00  27.50  23.75  30.80  21.00  27.10  22.10  19.25  25.45  24.10
Test B     >10    3.5    1.5    2.5      7    6.5    7.5      9    3.5    >10

Ranking the results according to their order in test A (from highest values to lowest) there
is obtained

 1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
 1   1   5   1   5  10   5  10  13   5   5  18  13  10  18

16  17  18  19  20  21  22  23  24  25  26  27  28  29
18  18   1  18  18  18  16  18  18  15  18  18  17  18

The lower ranking has 1 pair, 1 triplet, 1 quadruplet, 1 quintuplet and one 12-member
multiplet. The maximum possible score is $\frac{29 \times 28}{2} - 1 - 3 - 6 - 10 - 66 = 320$ in such a
ranking. The actual score is +212, giving $\tau = \frac{212}{320} = 0.6625$. The variance of the distribution
of the scores obtained with such rankings in the case of no correlation is

$$\frac{29 \cdot 28 \cdot 63}{18} - 1 - \frac{11}{3} - \frac{26}{3} - \frac{50}{3} - \frac{638}{3} = 2599.33.$$

Hence the probability of obtaining a score of 212 or more from an uncorrelated pair of
such rankings corresponds to the probability of a normal variate attaining or exceeding
$212/\sqrt{2599.33} = 4.158$ times its standard deviation.
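The arithmetic of this example can be retraced as follows. The sketch assumes the readings of the table given above, and codes each result that was off the scale in test B as infinity so that all twelve tie with one another; that coding and the variable names are conveniences of the illustration.

```python
from collections import Counter
from math import inf, sqrt

a = [40.80, 41.70, 36.75, 37.55, 29.40, 25.20, 26.75, 28.45, 26.85, 26.35,
     21.40, 19.65, 18.95, 22.90, 22.80, 20.25, 24.45, 22.70, 26.50,
     22.00, 27.50, 23.75, 30.80, 21.00, 27.10, 22.10, 19.25, 25.45, 24.10]
b = [1.5, 1.5, 1.5, 2.5, 3.5, inf, 2.5, 2.5, 2.5, 6.5,
     inf, inf, inf, inf, inf, inf, inf, inf, inf,
     inf, 3.5, 1.5, 2.5, 7, 6.5, 7.5, 9, 3.5, inf]

# Arrange the test B results in the order of test A, highest A first.
lower = [bv for _, bv in sorted(zip(a, b), key=lambda pair: -pair[0])]
n = len(lower)

# Score: +1 or -1 for each pair according to the order of the B results, 0 for ties.
S = sum((lower[j] > lower[i]) - (lower[j] < lower[i])
        for i in range(n) for j in range(i + 1, n))

ties = [r for r in Counter(lower).values() if r > 1]
max_score = n * (n - 1) // 2 - sum(r * (r - 1) // 2 for r in ties)
variance = n * (n - 1) * (2 * n + 5) / 18 - sum(
    sum(k * k - 1 for k in range(2, r + 1)) / 3 for r in ties)

tau = S / max_score
z = S / sqrt(variance)
print(S, max_score, tau, round(variance, 2), round(z, 3))
```

Only the order of the test B results enters into the score, so the actual tied rank assigned (here 18) has no influence on S.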



REFERENCES 

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81.

Kendall, M. G. (1942). Partial rank correlation. Biometrika, 32, 277.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1, chapter 16, Rank Correlation. London:
C. Griffin and Co. Ltd.

Kendall, M. G. (1945). The treatment of ties in ranking problems. Biometrika, 33, 239.







THE USE OF RANGE IN PLACE OF STANDARD
DEVIATION IN THE t-TEST

By E. LORD, B.Sc., Shirley Institute, Didsbury, Manchester

CONTENTS

(i) Introduction
(ii) The t-test
(iii) The modified test (u-test) based on range
(iv) Computation of percentage points of the distribution of u = u(1, n), i.e. case with m = 1
(v) Computation of percentage points of the distribution of u = u(2, n), i.e. case with m = 2
(vi) Computation of percentage points of the distribution of u = u(m, n) for m > 2
(vii) Approximate values of the percentage points of u
(viii) Applications of the u-test
Appendix. On the independence of mean and some linear estimates of standard deviation
in random samples from a normal population

(i) Introduction 

The difference between highest and lowest values has always been recognized as a general
indication of the variability of quantitative data. It was not, however, until 1925 that
attention began to be focused upon the range as a useful statistical tool. In his paper ‘On
the extreme individuals and the range in samples taken from a normal population’, Tippett
(1925) obtained an expression for the mean value of the range in repeated random samples,
and computed its value in terms of the population standard deviation for samples of size
n = 2 to n = 1000. He also gave numerical approximations to the values of the moments
of the range for fairly large samples.

The work was taken up by E. S. Pearson (1926), who determined numerically the exact
values of the moments of the random sampling distribution of the range for small samples
of size n ≤ 6, and also approximations to their values for samples of medium size. In a
subsequent paper E. S. Pearson (1932) tabulated the upper and lower percentage limits for
the distribution of the range from frequency curves fitted with the values of the moment
coefficients taken from both of the earlier papers cited above.

The next advance was the determination of a general expression for the distribution of 
the range in samples of n random values from any population by McKay & Pearson (1933). 
For the normal population, only in the case of n = 3 was it found possible to obtain a fairly 
simple analytical form. (The distribution for n = 2 is, of course, well known, taking the 
form of the positive half of a normal curve.) 

Hartley (1942) later determined an expression for the probability integral of the range and,
with Pearson (1942), tabulated this for the normal population for samples between n = 2 and 
n = 20. This latter paper also contains a table of several percentage limits of the range 
in samples from a normal population. These limits are derived from the numerical values of 
the probability integrals and replace the approximate values previously given by Pearson 
referred to above. 

Tippett (1925) and Pearson (1932) have pointed out that although the total range in a
sample may be used for the purpose of estimating the population value of the standard
deviation, a more efficient measure may be obtained by dividing the sample into random
subgroups of equal size and using the mean range of the subgroups in place of the total range.
The efficiency of range estimates of standard deviation is, of course, always less than that of
root-mean-square estimates, but the work of Davies & Pearson (1934) and Pearson & Haines
(1935) indicates that information is not discarded to any serious extent provided that the
number of observations in the subsamples is not greatly in excess of about 10.

As a result of the work outlined above, the range is now of considerable importance in 
many fields, especially in industrial quality control, where its simplicity has enabled it to 
be extensively and easily applied to the measurement of fluctuations in the variability of 
quality of a manufactured article or material. 

In the present paper an investigation is made of the use of range estimates of standard
deviation in the consideration of the statistical significance of deviations of sample means in
normal random sampling theory. This use of range estimates of standard deviation is
analogous to the use of root-mean-square estimates in the well-known t-test. Tables are
given, at several probability levels, and these may be employed in determining the statistical
significance of either the deviation of a sample mean from some fixed or hypothetical population
value, or the difference between the means of two samples. These tables may also be
used for obtaining rapid estimates of the accuracy of a sample mean from the variation
within the sample as measured by the range. The use of range, in place of root-mean-square
estimates of standard deviation, in this modified form of the t-test necessarily entails some
loss of precision. It will, however, be shown in a future paper that this reduction in accuracy
is negligible for all practical purposes. Furthermore, this slight disadvantage of the new test
is compensated by its greater simplicity, involving a reduced amount of computing compared
with the usual t-test.

The range test is suitable for application to many problems frequently encountered in the 
treatment of various types of experimental data and in considering the mean character value 
in small samples in biological experiments. In the industrial field, the range test may be 
used for detecting changes in mean quality level, especially where the variation is not 
under strict statistical control or is subject to secular changes, or for determining whether 
the average level of a batch determined from a sample is in accordance with specification 
demands. A number of these problems are covered in the examples given at the end of the 
paper. 


(ii) The t-test*

In testing the significance of the deviation of a sample mean $\bar{x}$ from an assumed population
value $\xi$, use is made of the ratio

$$t = \frac{|\bar{x} - \xi|}{s/\sqrt{N}}, \tag{1}$$

where N is the size of the sample and s is the root-mean-square estimate of the population
standard deviation determined from the sample. In applying this ratio it is assumed that
the N values form a random sample from a normal population of which the mean is $\xi$,
standard deviation $\sigma$ and the distribution of values of x is given by

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\xi)^2/2\sigma^2}. \tag{2}$$


* ‘Student’ (1908), R. A. Fisher (1925). 





More generally t may be defined as the ratio

$$t = x/s, \tag{3}$$

where x and s are statistically independent, x being a quantity distributed normally about
a mean of zero and s a root-mean-square estimate based on $\nu$ degrees of freedom of the
standard error of x. Although the use of the tables of the probability integral of t enables
the most efficient tests to be made of the various forms of the so-called ‘Student’s Hypothesis’,
occasions frequently arise when more rapid tests are desirable, especially if accompanied
by only inappreciable loss of accuracy. The calculation of s, depending upon the
squaring of numerical quantities, entails a certain amount of labour, especially if tables of
squares or a calculating machine are not available. The use of the range, or the mean range
determined from random subgroups in a sample, enables very rapid estimates to be made
of the population value of the standard deviation $\sigma$. In the following section these range
estimates are used in place of root-mean-square estimates in a modified form of the t-test.
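For comparison with the range test which follows, the ratio (1) may be computed as in this small sketch. It assumes that s is the usual estimate with divisor N - 1; the function name and the figures are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_ratio(sample, xi):
    """Ratio (1): |mean - xi| / (s / sqrt(N)), s having the divisor N - 1."""
    n = len(sample)
    return abs(mean(sample) - xi) / (stdev(sample) / sqrt(n))


print(t_ratio([1, 2, 3, 4, 5], 2))
```

The squaring of the deviations inside `stdev` is the labour which the range estimate is designed to avoid.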


(iii) The modified test (u-test) based on range
Here we replace the s of ‘Student’s’ ratio by an estimate of $\sigma$ based on the range. Thus

$$u = u(m, n) = \frac{x}{\bar{w}(m, n)/d_n}, \tag{4}$$

where x is a quantity distributed normally about a mean of zero and $\bar{w}(m, n)$ is the mean
value of m ranges w, obtained from m independent samples or subgroups, each containing
n observations. The constant $d_n$, in a commonly used notation,* is the expected value of the
range in samples of n, randomly selected from a normal population of unit standard deviation.
The ratio $\bar{w}(m, n)/d_n$ is therefore an estimate of the standard error of x obtained from
range and, as such, replaces the root-mean-square estimate s used in the ratio t = x/s.

Except for a few special cases, it has not been found possible to determine the analytical
form of the distribution of u, but several tables of percentage points have been computed
for use in testing the various statistical hypotheses normally covered by the t-test. The
computation of these tables is considerably simplified by first determining the percentage
points of the distribution of the subsidiary quantity

$$q = q(m, n) = \frac{u(m, n)}{d_n} = \frac{x}{\bar{w}(m, n)}, \tag{5}$$

and then multiplying by the corresponding value of $d_n$ to obtain the percentage points of the
u distribution.

To simplify the algebraic expressions in what follows, u, w and q will be written for u(m, »), 
w(m,n) and q(m,n) where no confusion is involved. 

The distributions of both u and q are clearly independent of σ. Hence, without any loss
of generality, σ may be taken equal to unity in considering the distributions. The distribution
of x will therefore be defined by

$$p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}. \tag{6}$$

* See, for example, Pearson (1935), pp. 84 and 90.





Range in place of standard deviation in the t-test 


Furthermore, let the distribution of the range w in a sample of n be y = p(w), that of $\bar{w}$
be $y = p(\bar{w})$, and that of q be y = p(q). Then since x and $\bar{w}$ are defined to be statistically
independent, we have the distribution of q given by

$$p(q) = \int p(\bar{w})\, p(x)\, d\bar{w}\, dx, \tag{7}$$

where the integral is to be evaluated over the field of all values of x and $\bar{w}$ subject to the
relation (5) and to the conditions:

$$-\infty < x < \infty, \qquad 0 \le \bar{w} < \infty. \tag{8}$$

Since x is distributed symmetrically about zero, and $\bar{w}$ is independent of x, the ratio q is
also symmetrically distributed about zero. Let $q_\alpha$ be the value of q such that α is the chance
that $|q| > q_\alpha$. The quantity α represents the total area of the two equal tails of the
distribution lying outside deviations $\pm q_\alpha$, and we have

$$(1 - \alpha) = 2\int_0^{q_\alpha} p(q)\, dq. \tag{9}$$


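Alternatively, from (6), (7) and (8), this may be written in the form

$$1 - \alpha = \int_0^\infty \left\{ p(\bar{w}) \int_{-\bar{w} q_\alpha}^{+\bar{w} q_\alpha} p(x)\, dx \right\} d\bar{w}
= \int_0^\infty \left\{ p(\bar{w})\, \frac{2}{\sqrt{2\pi}} \int_0^{\bar{w} q_\alpha} e^{-\frac{1}{2}x^2}\, dx \right\} d\bar{w}. \tag{10}$$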


Except for a few cases in which the analytical form of the distribution of u has been
obtained, equation (10) has been used to compute values of $q_\alpha$ and, hence, the percentage
points of u for values of α = 0.10, 0.05, 0.02, 0.01, 0.002 and 0.001, with values of n from
2 to 20 and values of m from 1 to 10, 15, 20, 30, 60 and 120.

The percentage points of u = u(m, n) are first considered for the case when the estimate of
standard deviation is based on the value of a single range of n random values (i.e. for m = 1).
This treatment is followed by the case of m = 2, and finally consideration is given to the
general case using estimates of standard deviation determined from the mean of m ranges
each from an independent subgroup of n random values.


(iv) Computation of percentage points of the distribution of
u = u(1, n), i.e. case with m = 1

Throughout this section the estimates of standard deviation are all based upon the value of
the range in a single set of n random values of the variate (thus $\bar{w} = w$). In the case of n = 2
and n = 3 analytical solutions are derived for the distributions from which the percentage
points of q and u are calculated. For n ≥ 4, percentage points in the neighbourhood of those
desired are determined by quadrature methods, and the required points obtained from these
by interpolation.

Special case n = 2, m = 1

The distribution of ranges (w) in samples of two random values from a normal population
with unit standard deviation is the distribution of absolute differences between random pairs
of variate values, and this may be easily shown to be

$$p(w)\, dw = \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}w^2}\, dw \qquad (0 \le w < \infty). \tag{11}$$



E. Lord 45 

Since x and w are independent, it follows from (6) and (11) that their joint probability
distribution is

$$p(w, x)\, dw\, dx = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}(x^2 + \frac{1}{2}w^2)}\, dw\, dx. \tag{12}$$

Transforming to new variables, q = x/w and w, and noting that the Jacobian of the
transformation is

$$\frac{\partial(x, w)}{\partial(q, w)} = w,$$

the joint distribution of q and w is given by

$$p(q, w)\, dq\, dw = \frac{1}{\pi\sqrt{2}}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dq\, dw. \tag{13}$$


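To obtain the distribution of q it is necessary to integrate (13) over the whole field of w,
from 0 to ∞. This gives

$$p(q)\, dq = \frac{1}{\pi\sqrt{2}}\, \frac{dq}{q^2 + \frac{1}{2}}. \tag{14}$$

Hence from (9) and (14) above, the percentage points of q are given by

$$(1 - \alpha) = \frac{2}{\pi\sqrt{2}} \int_0^{q_\alpha} \frac{dq}{q^2 + \frac{1}{2}} = \frac{2}{\pi} \tan^{-1}\left(\sqrt{2}\, q_\alpha\right),$$

and hence

$$q_\alpha(1, 2) = \frac{1}{\sqrt{2}} \tan\left[\frac{\pi}{2}(1 - \alpha)\right] = \frac{1}{\sqrt{2}} \cot\left(\frac{\pi\alpha}{2}\right). \tag{15}$$

The values of $q_\alpha$ determined from (15), for the six values of α under consideration, are
multiplied by $d_2 = 2/\sqrt{\pi}$ to give the required percentage points of the distribution of
u = u(1, 2).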
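
The closed form (15) is easy to evaluate directly. The following is a minimal sketch, assuming the reading q_α(1, 2) = (1/√2) cot(πα/2) and d_2 = 2/√π; the function name `u_point_n2` is hypothetical and introduced only for illustration.

```python
import math

def u_point_n2(alpha):
    """Two-sided percentage point of u(1, 2), assuming the closed form (15)."""
    # q_alpha = (1/sqrt(2)) * cot(pi * alpha / 2), written with 1/tan
    q_alpha = (1.0 / math.sqrt(2.0)) / math.tan(math.pi * alpha / 2.0)
    d_2 = 2.0 / math.sqrt(math.pi)  # expected range of two unit-normal values
    return q_alpha * d_2

# Print the six levels considered in the text for comparison with Table 1.
for alpha in (0.10, 0.05, 0.02, 0.01, 0.002, 0.001):
    print(alpha, round(u_point_n2(alpha), 4))
```

The printed values can be set against the n = 2 row of Table 1 (5.0376 at α = 0.10, 10.1381 at α = 0.05, and so on).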


Special case n = 3, m = 1

For random samples of size n = 3 from a normal distribution with unit standard deviation,
the distribution of the range has been found by McKay & Pearson (1933) and takes the form

$$p(w)\, dw = \frac{6}{\sqrt{\pi}}\, e^{-\frac{1}{4}w^2}\, dw \int_0^{w/\sqrt{6}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\, dz.$$

Again, since w and x are independent, their joint distribution is given by

$$p(w, x)\, dw\, dx = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}(x^2 + \frac{1}{2}w^2)}\, dw\, dx \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz. \tag{16}$$

Transforming to new variables q = x/w and w, it follows from (16), since the Jacobian of
the transformation is equal to w, that the joint distribution of q and w is given by

$$p(q, w)\, dq\, dw = \frac{3}{\pi\sqrt{\pi}}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dw\, dq \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz. \tag{17}$$




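To obtain the distribution of q the expression in (17) has to be integrated over the whole
field of w, from 0 to ∞. Thus

$$p(q) = \frac{3}{\pi\sqrt{\pi}} \int_0^\infty e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, w\, dw \int_0^{w/\sqrt{6}} e^{-\frac{1}{2}z^2}\, dz.$$

Putting $t = z\sqrt{6}$, the above may be written

$$p(q) = \frac{3}{\pi\sqrt{6\pi}} \left[ -\frac{1}{q^2 + \frac{1}{2}}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})} \int_0^{w} e^{-t^2/12}\, dt \right]_{w=0}^{w=\infty}
+ \frac{3}{\pi\sqrt{6\pi}\,(q^2 + \frac{1}{2})} \int_0^\infty e^{-w^2/12}\, e^{-\frac{1}{2}w^2(q^2 + \frac{1}{2})}\, dw. \tag{18}$$

The first expression in (18) is clearly zero and hence

$$p(q) = \frac{3}{\pi\sqrt{6\pi}\,(q^2 + \frac{1}{2})} \int_0^\infty e^{-\frac{1}{2}w^2(q^2 + \frac{2}{3})}\, dw
= \frac{\sqrt{3}}{2\pi\,(q^2 + \frac{1}{2})(q^2 + \frac{2}{3})^{\frac{1}{2}}}. \tag{19}$$

As before

$$(1 - \alpha) = \frac{\sqrt{3}}{\pi} \int_0^{q_\alpha} \frac{dq}{(q^2 + \frac{1}{2})(q^2 + \frac{2}{3})^{\frac{1}{2}}}
= \frac{6}{\pi} \tan^{-1}\left[ \frac{q_\alpha}{(2 + 3q_\alpha^2)^{\frac{1}{2}}} \right],$$

and therefore

$$\frac{q_\alpha}{(2 + 3q_\alpha^2)^{\frac{1}{2}}} = \tan\left[ \frac{\pi}{6}(1 - \alpha) \right].$$

If $\tan\left[\frac{\pi}{6}(1 - \alpha)\right]$ be denoted by τ, then

$$q_\alpha(1, 3) = \frac{\sqrt{2}\,\tau}{(1 - 3\tau^2)^{\frac{1}{2}}}. \tag{20}$$

The six required values of $q_\alpha$ are found by substitution of the corresponding values of α in
(20) above, and further multiplication by $d_3 = 3/\sqrt{\pi}$ gives the percentage limits of the
distribution of u = u(1, 3).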
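
As with n = 2, the result (20) can be checked numerically. The sketch below assumes the reading τ = tan[π(1 − α)/6], q_α(1, 3) = √2 τ/(1 − 3τ²)^½ and d_3 = 3/√π; the function name `u_point_n3` is hypothetical.

```python
import math

def u_point_n3(alpha):
    """Two-sided percentage point of u(1, 3), assuming the closed form (20)."""
    tau = math.tan(math.pi * (1.0 - alpha) / 6.0)
    q_alpha = math.sqrt(2.0) * tau / math.sqrt(1.0 - 3.0 * tau * tau)
    d_3 = 3.0 / math.sqrt(math.pi)  # expected range of three unit-normal values
    return q_alpha * d_3

# Print the six levels considered in the text for comparison with Table 1.
for alpha in (0.10, 0.05, 0.02, 0.01, 0.002, 0.001):
    print(alpha, round(u_point_n3(alpha), 4))
```

Because 1 − 3τ² becomes small as α decreases, the extreme points are sensitive to rounding in τ, which may be one reason fewer figures are quoted for them in Table 1.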


General case n ≥ 4, m = 1

For n ≥ 4 no suitable algebraic expression exists for the distribution of the range, but
Pearson & Hartley (1942) have tabulated values of the probability integral $\int_0^w p(w)\, dw$ to
4 figures at intervals of 0.05 of w for values of n from 2 to 20. Hartley kindly lent manuscript
tables of the integral tabulated to 5 figures at intervals of 0.25 of w for values of n from
2 to 16. Using these five-figure tables for n = 4, 6, 10, 16 and the four-figure tables for n = 20,
the frequency distribution of w was obtained numerically by subtraction of successive values
of $\int_0^w p(w)\, dw$ at intervals of 0.25 and then converting these class frequencies into ordinates
y(w). The degree of approximation in the formula used implied the vanishing of fifth
differences (see K. Pearson, Tables for Statisticians and Biometricians, Part II, p. xvii).





Each case was treated in turn, and the six values of the percentage points $q_\alpha$, corresponding
to the six different values of α under consideration, were determined using the relations
given in (10) above. Taking a trial value of $q_\alpha$, the integrals

$$I(w, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{w q_\alpha} e^{-\frac{1}{2}x^2}\, dx \tag{21}$$

were calculated at intervals of 0.25 over the whole range of w. Quadrature was then applied
to the products $y(w)\, I(w, q_\alpha)$ over the range 0 ≤ w < ∞, to obtain the value of (1 − α)
corresponding to the trial value of $q_\alpha$. This procedure was repeated a number of times to obtain
values of (1 − α) corresponding to a series of equidistant values of $q_\alpha$. The required values of
$q_\alpha$ corresponding to the six values of α under investigation were then obtained by backward
interpolation.

As n → ∞ the ratio $w/d_n$ tends to the population value of the standard deviation. Furthermore,
for n = 2 and n = 3, exact values of $q_\alpha$ had been previously obtained by direct
calculation for the six values of α. Thus it was possible to make initial estimates of the required
values of $q_\alpha$, and the process of this 'trial and error' method was not found too laborious.
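
The quadrature procedure can be imitated on a modern machine. The sketch below is not the method of the text in two respects, both assumptions made for illustration: the ordinates y(w) are computed from the standard integral for the density of the range of n unit-normal values, p(w) = n(n − 1) ∫ φ(x) φ(x + w) [Φ(x + w) − Φ(x)]^(n−2) dx, rather than differenced from the Pearson & Hartley tables, and bisection replaces the trial values and backward interpolation. All function names are hypothetical. The simple trapezoidal rule used here is adequate for n ≥ 4 but is cruder for n = 2, where the ordinate does not vanish at w = 0.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(x):
    """Ordinate of the unit normal curve."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Probability integral of the unit normal curve."""
    return 0.5 * (1.0 + math.erf(x / SQRT2))

def trapezoid(values, h):
    """Trapezoidal quadrature over equally spaced ordinates."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

def range_ordinates(n, h=0.05, w_max=12.0, x_lim=8.0):
    """Ordinates y(w) of the range distribution on a grid of w (assumed formula)."""
    ws = [i * h for i in range(int(round(w_max / h)) + 1)]
    xs = [-x_lim + i * h for i in range(int(round(2 * x_lim / h)) + 1)]
    ys = []
    for w in ws:
        integrand = [phi(x) * phi(x + w) * (Phi(x + w) - Phi(x)) ** (n - 2)
                     for x in xs]
        ys.append(n * (n - 1) * trapezoid(integrand, h))
    return ws, ys, h

def u_point(alpha, ws, ys, h):
    """Solve (10) for q_alpha by bisection, then multiply by d_n."""
    d_n = trapezoid([w * y for w, y in zip(ws, ys)], h)  # mean range

    def central_area(q):
        # I(w, q) of (21) equals erf(w q / sqrt(2)); weight by y(w) and integrate
        return trapezoid([y * math.erf(q * w / SQRT2) for w, y in zip(ws, ys)], h)

    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if central_area(mid) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * d_n

ws, ys, h = range_ordinates(4)
print(round(u_point(0.10, ws, ys, h), 3))
```

For n = 4 and α = 0.10 the result should fall close to the framework value 2.1793 quoted in Table 1.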


Table 1. Framework values of percentage points of u = u(1, n)

                                    α
    n        0.10      0.05      0.02      0.01      0.002     0.001
    2       5.0376   10.1381    25.389    50.791    253.97    507.95-
    3       2.5935-   3.8225     6.188     8.819     19.84     28.08
    4       2.1793    2.9505+    4.213     5.420      9.42     11.75+
    6       1.9354    2.4755+    3.249     3.900      5.71      6.66
   10       1.8064    2.2390     2.807     3.244      4.32      4.82
   16       1.7496    2.1385+    2.628     2.990      3.82      4.19
   20(a)    1.7320    2.1083     2.576     2.916      3.69      4.01
   20(b)    1.7314    2.1074     2.576     2.916      3.69      4.02
    ∞       1.6449    1.9600     2.326     2.576      3.09      3.29


The framework values of the percentage points of u(1, n) were obtained by multiplying
the values of $q_\alpha$ by the corresponding values of the mean range $d_n$ tabulated by Tippett
(1925) and are given in Table 1, together with the exact values for n = 2 and n = 3
determined above.

As a check on the accuracy of determination of these percentage points, the six values
for n = 20 were also calculated by a second method. Writing

$$r = 1/q = w/x, \tag{22}$$

the method is to determine, for the six values of α, the corresponding values of $q_\alpha$ such that

$$\alpha = \int_{-1/q_\alpha}^{0} p(r)\, dr + \int_{0}^{1/q_\alpha} p(r)\, dr, \tag{23}$$

where y = p(r) denotes the frequency distribution of r. Since q is distributed symmetrically
about zero, its reciprocal r is also distributed symmetrically about zero, and from (22)
and (23) it follows that

$$\alpha = 2\int_0^{1/q_\alpha} p(r)\, dr = 2\int_0^\infty p(x)\, dx \int_0^{x/q_\alpha} p(w)\, dw. \tag{24}$$




Ordinates of the normal curve $y(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$ were taken at intervals of 0.25 for x in
the range 0 ≤ x < ∞ from K. Pearson's Tables for Statisticians and Biometricians, Part I.
Taking a trial value of $1/q_\alpha$, the integrals $\int_0^{x/q_\alpha} p(w)\, dw$ were calculated from Hartley's
four-figure tables for each value of x in the above range. By quadrature applied to the products
$p(x) \int_0^{x/q_\alpha} p(w)\, dw$ the integral (24) was evaluated. By taking a series of equidistant values of
$1/q_\alpha$ other trial values of α were determined. Backward interpolation was then used to
obtain the required values of $q_\alpha$ corresponding to the six values of α under consideration.
Finally, the six percentage points of u were determined by multiplying the values of $q_\alpha$ by
$d_{20}$ given in Tippett's table. These percentage points are given in the penultimate line (b)
of Table 1, and comparison with the corresponding values in the line above, (a), indicates
good agreement between the two methods of computation.

Since the percentage points of u for n = 4, 6, 10 and 16 have been determined using
Hartley's five-figure manuscript tables of the cumulative frequency distribution of w, they
should be at least as accurate as the percentage points for n = 20 determined from the
four-figure tables.

Changes in the percentage points at the different levels of significance run most smoothly
if arguments proportional to 1/n are used in place of n, and reciprocals of u for the variate.
Using a six-point general Lagrangian formula applied to the points corresponding to
n = 3, 4, 6, 10, 16 and 20, values of percentage points of u were determined for n = 5, 7, 8, 14
and 18. (In the case of n = 20 the mean values of the percentage points determined by the
two methods were used.) The interval was then halved, using a nine-point Lagrangian
through points corresponding to n = 4, 6, 8, 10, 12, 14, 16, 18 and 20. Finally, the six sets
of percentage points were differenced as a check, reduced by either one or two figures and, with
the exception of those for n = 5, are given in the second columns of Tables 3-8 under m = 1.

For n = 5, the six percentage points of u were independently determined at a later stage
of the investigation by the method used above for n = 4, 6, 10, 16 and 20, and it is these
values which are given in Tables 3-8. In the table below, the values obtained by
interpolation from the framework values are compared with those determined by direct
calculation.


Percentage points of u(1, 5)

    α =                      0.10    0.05    0.02    0.01    0.002   0.001
    By direct calculation    2.019   2.635   3.56    4.38    6.8     8.2
    By interpolation         2.020   2.635   3.56    4.38    6.8     8.2


In one case only, for α = 0.10, is there a difference as much as one unit in the last figure,
the actual values obtained being 2.0192 by direct computation and 2.0198 by interpolation.
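
The interpolated value can be reproduced with a short sketch of the six-point Lagrangian step, taking 1/n as argument and 1/u as variate as described above. The framework values are those quoted in Table 1 for α = 0.10 (with the mean of the two n = 20 determinations); the helper name `lagrange` is hypothetical.

```python
def lagrange(xs, ys, x):
    """Value at x of the Lagrangian polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total

# Framework 10% points of u(1, n); n = 20 is the mean of lines (a) and (b).
framework = {3: 2.5935, 4: 2.1793, 6: 1.9354, 10: 1.8064, 16: 1.7496,
             20: 0.5 * (1.7320 + 1.7314)}

args = [1.0 / n for n in framework]             # argument proportional to 1/n
recips = [1.0 / u for u in framework.values()]  # reciprocal of u as variate

u_5 = 1.0 / lagrange(args, recips, 1.0 / 5)
print(round(u_5, 4))
```

The result should agree closely with the interpolated figure 2.0198 quoted above, against 2.0192 by direct computation.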

Taking all the various checks into consideration, it appears unlikely that the values of
the percentage points of u(1, n) given in the tables at the end of the paper are in error by
more than one unit in the last place. The values for n = 2 and n = 3 are, of course, exact.





(v) Computation of percentage points of the distribution of
u = u(2, n), i.e. case with m = 2

The exact distribution of u(2, n) cannot be determined analytically except in the case of
n = 2, and hence the various percentage points have necessarily to be evaluated almost
wholly by numerical methods.

The probability of the joint occurrence of a pair of ranges w′ and w″ from random samples
of equal size n from a normal population of unit standard deviation is given by

$$p(w', w'')\, dw'\, dw'' = p(w')\, p(w'')\, dw'\, dw''. \tag{25}$$

If $\bar{w}$ be the mean value of the two ranges, then its distribution is obtained by integrating (25):

$$p(\bar{w})\, d\bar{w} = \int p(w')\, p(w'')\, dw'\, dw'', \tag{26}$$

the integration being taken over the whole field of w′ and w″ subject to the conditions

$$\bar{w} = \tfrac{1}{2}(w' + w'') \qquad (0 \le w' < \infty,\ 0 \le w'' < \infty). \tag{27}$$

We shall change the variables from w′ and w″ to $\bar{w}$ and w′, the Jacobian of the transformation
being equal to 2. With these new variables, and noting from (27) that w′ varies from 0 to
$2\bar{w}$, equation (26) gives

$$p(\bar{w}) = 2\int_0^{2\bar{w}} p(w')\, p(2\bar{w} - w')\, dw'. \tag{28}$$


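Special case n = 2, m = 2

In equation (11) above is given the distribution of the range in samples of two random
values from a normal population with unit standard deviation. Substituting this in (28)
above, the distribution of means of two independent values of w is therefore given by

$$p(\bar{w}) = 2\int_0^{2\bar{w}} \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}{w'}^2} \cdot \frac{1}{\sqrt{\pi}}\, e^{-\frac{1}{4}(2\bar{w} - w')^2}\, dw'
= \frac{4}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2} \int_0^{2\bar{w}} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(w' - \bar{w})^2}\, dw'. \tag{29}$$

Using the notation

$$I(\bar{w}) = \frac{1}{\sqrt{2\pi}} \int_0^{\bar{w}} e^{-\frac{1}{2}t^2}\, dt,$$

the expression (29) for the distribution of the mean range may be written in the form

$$p(\bar{w}) = \frac{4\sqrt{2}}{\sqrt{\pi}}\, e^{-\frac{1}{2}\bar{w}^2}\, I(\bar{w}). \tag{30}$$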
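
The step from (28) to the closed form can be checked numerically. The sketch below applies trapezoidal quadrature to (28) using the n = 2 range distribution (11), and compares the result with the closed form p(w̄) = 4√(2/π) e^(−w̄²/2) I(w̄), where I(w̄) is taken to be the unit-normal probability integral from 0 to w̄; that reading of I, and all function names, are assumptions made for illustration.

```python
import math

def p_range_n2(w):
    """Distribution (11) of the range of two unit-normal values."""
    return math.exp(-0.25 * w * w) / math.sqrt(math.pi)

def p_mean_of_two(w_bar, p, steps=2000):
    """Equation (28): density of the mean of two independent ranges, by quadrature."""
    h = 2.0 * w_bar / steps
    vals = [p(i * h) * p(2.0 * w_bar - i * h) for i in range(steps + 1)]
    return 2.0 * h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def p_mean_of_two_closed(w_bar):
    """Assumed closed form for n = 2, with I the normal integral from 0 to w_bar."""
    big_i = 0.5 * math.erf(w_bar / math.sqrt(2.0))
    return 4.0 * math.sqrt(2.0 / math.pi) * math.exp(-0.5 * w_bar ** 2) * big_i

for w_bar in (0.25, 1.0, 2.5):
    print(w_bar, p_mean_of_two(w_bar, p_range_n2), p_mean_of_two_closed(w_bar))
```

The same `p_mean_of_two` routine accepts any tabulated or computed range density, which is how the general case with n ≥ 3 is handled by quadrature.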

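We may now proceed to determine the distribution of the ratio $q = x/\bar{w}$. Since they are
independent, the joint distribution of x and $\bar{w}$ is, from (6) and (30), given by

$$p(x, \bar{w})\, dx\, d\bar{w} = \frac{4}{\pi}\, e^{-\frac{1}{2}(x^2 + \bar{w}^2)}\, I(\bar{w})\, dx\, d\bar{w}. \tag{31}$$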


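Transforming to variables q and $\bar{w}$, noting that the Jacobian of the transformation is equal
to $\bar{w}$, and integrating, we obtain

$$p(q) = \frac{4}{\pi} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w})\, \bar{w}\, d\bar{w}. \tag{32}$$

Now since

$$\frac{dI(\bar{w})}{d\bar{w}} = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2},$$

we make use of the identity

$$\frac{d}{d\bar{w}} \left\{ e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w}) \right\}
= -\bar{w}(q^2 + 1)\, e^{-\frac{1}{2}\bar{w}^2(q^2 + 1)}\, I(\bar{w}) + \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}\bar{w}^2(q^2 + 2)}$$

to evaluate the integral in (32) and obtain

$$p(q) = \frac{4}{\pi\sqrt{2\pi}\,(q^2 + 1)} \int_0^\infty e^{-\frac{1}{2}\bar{w}^2(q^2 + 2)}\, d\bar{w}
= \frac{2}{\pi (q^2 + 1)\sqrt{q^2 + 2}}. \tag{33}$$

The ratio q is distributed symmetrically about zero and from (9) and (33) we obtain

$$1 - \alpha = \frac{4}{\pi} \int_0^{q_\alpha} \frac{dq}{(q^2 + 1)\sqrt{q^2 + 2}},$$

and the percentage points are therefore given by

$$q_\alpha = q_\alpha(2, 2) = \frac{\sqrt{2}\,\tau}{(1 - \tau^2)^{\frac{1}{2}}}, \tag{34}$$

where

$$\tau = \tan\left[ \frac{\pi}{4}(1 - \alpha) \right].$$

Substitution of α = 0.10, 0.05, etc., in (34) gives the required values of $q_\alpha$, and further
multiplication by $d_2 = 2/\sqrt{\pi}$ gives the required corresponding percentage points of the
distribution of u(2, 2).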
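
A short numerical check of the n = 2, m = 2 case follows. It assumes that integrating (33) gives 1 − α = (4/π) arctan[q_α/√(q_α² + 2)], so that q_α = √2 τ/√(1 − τ²) with τ = tan[π(1 − α)/4]; this closed form is worked out here from (33) and should be treated as an assumption of the sketch, as should the function name `u_point_22`.

```python
import math

def u_point_22(alpha):
    """Two-sided percentage point of u(2, 2), assuming the closed form derived from (33)."""
    tau = math.tan(math.pi * (1.0 - alpha) / 4.0)
    q_alpha = math.sqrt(2.0) * tau / math.sqrt(1.0 - tau * tau)
    d_2 = 2.0 / math.sqrt(math.pi)  # expected range of two unit-normal values
    return q_alpha * d_2

# Print the six levels considered in the text for comparison with Table 2.
for alpha in (0.10, 0.05, 0.02, 0.01, 0.002, 0.001):
    print(alpha, round(u_point_22(alpha), 4))
```

The values for α = 0.10 and 0.05 can be compared with the n = 2 entries 2.6203 and 3.8671 of Table 2.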


General case n ≥ 3, m = 2

For n = 4, 6, 10, 16 and 20, the ordinates of the distribution of the range have been
previously evaluated above at intervals of 0.25 for w, and these are used in place of the
unknown p(w). Taking a particular value of $\bar{w}$, quadrature was then applied to the products
$y(w')\, y(2\bar{w} - w')$ to obtain a numerical estimate of $p(\bar{w})$ from equation (28). This process was
repeated at intervals of 0.25 for $\bar{w}$ through as much of the range 0 ≤ $\bar{w}$ < ∞ as was necessary
to obtain the required degree of accuracy. For n = 3, determination of the ordinates of the
distribution of $\bar{w}$ was also based on quadrature, but, in this case, exact figures for the
ordinates of the distribution of the range were obtained from its equation found by McKay
& Pearson (1933). Because of the rapid rise of the distribution near to the origin, estimates
of the lower values of $p(\bar{w})$ in this case were determined using a finer interval for $\bar{w}$ in order
to obtain the requisite accuracy. For higher values of $\bar{w}$ the interval was progressively
widened to the 0.25 used over the tail portion of the curve.





Treating in turn each curve of the distribution of $\bar{w}$ (for n = 3, 4, 6, 10, 16 and 20), the
values of the percentage points of the distribution of the ratio

$$q = \frac{u}{d_n} = \frac{x}{\bar{w}}$$

were computed by a method similar to that used in evaluating the percentage points of q
for the case m = 1. Taking trial values of $q_\alpha$, the integrals

$$I(\bar{w}, q_\alpha) = \frac{2}{\sqrt{2\pi}} \int_0^{\bar{w} q_\alpha} e^{-\frac{1}{2}x^2}\, dx$$

were calculated at intervals of 0.25 for $\bar{w}$. Quadrature was then applied to the products
$y(\bar{w})\, I(\bar{w}, q_\alpha)$ over the range 0 ≤ $\bar{w}$ < ∞ to obtain corresponding values of (1 − α), where α is
the sum of the two tails of the distribution beyond deviations $\pm q_\alpha$. Repeating this procedure,
a series of values of (1 − α) were obtained, corresponding to a set of equidistant values of $q_\alpha$.
Backward interpolation was then used to obtain the six values of $q_\alpha$ corresponding to the six
values of α under consideration. Finally, the required percentage points of u were obtained
by multiplication by the appropriate value of $d_n$, and are given in Table 2, together with
the exact values for n = 2 determined from (34).

Table 2. Framework values of percentage points of u = u(2, n)

                                    α
    n        0.10      0.05      0.02      0.01      0.002     0.001
    2       2.6203    3.8671     6.266     8.932     20.10     28.47
    3       2.0201    2.6366+    3.656+    4.340      6.78      8.66
    4       1.8760    2.3672     3.047     3.600      6.07      6.80
    6       1.7791    2.1920     2.727     3.133      4.11      4.52
   10       1.7226+   2.0926     2.561     2.884      3.64      3.96
   16       1.6961    2.0470     2.472     2.776      3.44      3.71
   20       1.6879    2.0329     2.449     2.742      3.38      3.64

As before, Lagrangian formulae were used to interpolate the intermediate values of the
percentage points of u, again taking arguments proportional to 1/n and reciprocals of the
percentage points for the variate. As a check, the values were inspected by determining
differences up to the third order and then reduced by one place of decimals. With the
exception of the values for n = 5, the reduced values are given in Tables 3-8.

For n = 5 the six percentage points of u were independently determined at a later stage
of the investigation by the same methods used for the framework values. These directly
computed values are given in the tables mentioned above, and are also reproduced below
for comparison with those obtained by interpolation from the framework values.


Percentage points of u = u(2, 5)

    α =                      0.10    0.05    0.02    0.01    0.002   0.001
    By direct calculation    1.814   2.254   2.84    3.29    4.4     5.0
    By interpolation         1.814   2.254   2.84    3.29    4.4     4.9



In the case of α = 0.001, direct calculation gives a value of 4.97 compared with 4.92
obtained by interpolation. For other percentage levels the agreement is exact to the number
of figures quoted.

(vi) Computation of percentage points of the distribution of
u = u(m, n) for m > 2

The variance of the mean range in m subgroups of equal size n steadily decreases as m
increases, and the ratio $\bar{w}(m, n)/d_n$ gives closer estimates of the population value of the
standard deviation of the variate. Hence, following the usual methods of large-sample theory,
the limiting values of the percentage points of u, for indefinitely large m, may be determined
from integral tables of the normal curves. For a given value of α, the limiting values of the
percentage points are, of course, equal for all values of n, and are also equal to the corresponding
limiting values of the percentage points of Fisher's t-distribution for an indefinitely large
number of degrees of freedom.

In general it was found for a particular value of α and of n that a three-point Lagrangian
curve with 1/m as argument and reciprocals of the percentage points of u as variate (passing
through points corresponding to m = 1, 2 and ∞) may be used for interpolation of the
required percentage points corresponding to values of m intermediate between 2 and ∞.
Only in the case of n = 2 and n = 3 was the required accuracy not attained by this procedure
and further investigation found necessary. Details of the methods used are given below.
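
The three-point interpolation in 1/m may be sketched as follows, for n = 5 and α = 0.05. The inputs are values quoted elsewhere in the text: u(1, 5) = 2.635, u(2, 5) = 2.254 and the normal limit 1.9600. The choice of m = 4 as the target and the helper name `lagrange` are illustrative assumptions; the result is not claimed to be the published entry in Tables 3-8.

```python
def lagrange(xs, ys, x):
    """Value at x of the Lagrangian polynomial through the points (xs[i], ys[i])."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        weight = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                weight *= (x - xj) / (xi - xj)
        total += weight * yi
    return total

# 5% points of u(m, 5) at m = 1, 2 and infinity (1/m = 1, 0.5 and 0).
u_known = {1.0: 2.635, 0.5: 2.254, 0.0: 1.9600}

args = list(u_known.keys())                   # argument 1/m
recips = [1.0 / u for u in u_known.values()]  # reciprocal of u as variate

u_4_5 = 1.0 / lagrange(args, recips, 1.0 / 4)  # interpolated u(4, 5)
print(round(u_4_5, 3))
```

The interpolated figure, about 2.10, lies between the m = 2 value and the normal limit, and is in keeping with the approximate 5% value of 2.1 given later for ν = m(n − 1) = 16 degrees of freedom.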

In the case of n = 2, the percentage points of the distribution of u(m, 2) were also
determined for m = 4 and m = 8 as follows. First, considering m = 4, it was necessary to obtain
numerical estimates of the ordinates of the distribution of the means of four ranges, each
range from a random sample of two values from a normal population with unit standard
deviation. Following the method used for m = 2 and leading to equation (28), it is easy to
show that the distribution of $\bar{w}(2m, n)$, the mean of 2m independent ranges each from a
random subsample of n values, is given in terms of the distribution of the mean, $\bar{w}(m, n)$, of
m such ranges by

$$p(\bar{w}(2m, n)) = 2\int_0^{2\bar{w}(2m, n)} p(\bar{w}(m, n))\, p(2\bar{w}(2m, n) - \bar{w}(m, n))\, d\bar{w}(m, n). \tag{35}$$

Using numerical values for $p(\bar{w})$ (m = 2, n = 2) given by equation (29) above, estimates
of the ordinates of the distribution $p(\bar{w})$ for m = 4, n = 2 at intervals of 0.25 were found by
quadrature methods similar to those described in previous sections by using the above
expression. A repetition of this process, using these last computed values, yielded numerical
estimates of the distribution of the means of eight ranges, i.e. m = 8. Again applying
quadrature to the two distributions, values of (1 − α) were determined for a series of equidistant
values of $q_\alpha$ (m = 4 and m = 8). The required values of $q_\alpha$, corresponding to the six values
of α between 0.10 and 0.001 under consideration, were then obtained by backward
interpolation, and hence the percentage points of u(4, 2) and u(8, 2).

The sets of six percentage points of u(m, 2) were determined for each required value of m
by Lagrangian interpolation, reciprocals of m being used as argument and reciprocals of
the corresponding percentage points as variate, the curve passing through the points
corresponding to m = 1, 2, 4, 8 and ∞. The interpolated values of percentage points obtained by
this method, and the directly computed values for m = 4 and m = 8, are given in Tables 3-8
for a series of values of m suitable for practical use. As a check upon the method, the
computation was repeated, this time using a four-point Lagrangian passing through points
corresponding to m = 1, 2, 4 and ∞. Most of these interpolated four-point values of the
percentage points agree exactly with the five-point values previously obtained. In cases
where differences arise, none exceed 1 unit in the last figure. The five-point Lagrangian
method of interpolation therefore certainly appears to be quite adequate for furnishing the
required degree of accuracy.

Numerical estimates of the distribution of the means of pairs of ranges from subsamples
of size n = 3 and n = 4 have already been obtained above. Using these in turn in equation
(35), estimates of the ordinates of the distributions of the means of four ranges were
determined at intervals of 0.25 by quadrature methods. The sets of percentage points of u(4, 3)
and u(4, 4) were then computed by the previous method of trial values and subsequent
backward interpolation.

For n = 3 and n = 4, the percentage points of the distribution of u, for given values of m,
are lower and nearer to their limiting values than the corresponding points for n = 2.
Furthermore, the changes in the values of the percentage points for small values of m are
also less abrupt. In view of the agreement between the four-point and five-point Lagrangian
interpolated values of the percentage points for n = 2, a four-point Lagrangian through
points corresponding to m = 1, 2, 4 and ∞ may certainly be relied upon to give adequate
accuracy for the interpolation of percentage points corresponding to intermediate values
of m in the case of n = 3 and n = 4. These values, together with the computed values for
m = 4, are given in Tables 3-8.

For the remaining values of n, from 5 to 20, the interpolated percentage points given in
Tables 3-8 have been obtained by means of a three-point Lagrangian curve, using values
of the percentage points corresponding to values of m = 1, 2 and ∞. As in the previous cases
of interpolation, reciprocals of m and the variate were used in order to obtain small changes
in successive differences. To show that this method is adequate, the six sets of percentage
points for n = 4 were also interpolated using a three-point Lagrangian. In every case except
one, these values agreed exactly with the four-point Lagrangian interpolated values
previously found and given in Tables 3-8. In the case of the sole exception, the difference
between the two interpolated values was only one unit in the last figure. For the less rapidly
changing values of the percentage points of u for n ≥ 5, the three-point method of
interpolation therefore provides sufficient accuracy for the present purpose.

Taking all checks into consideration it appears that the tabulated values of the
percentage points of the distribution of the function u = u(m, n) may be relied upon to the
accuracy given: occasionally the values may be one unit in error in the last figure.

In Tables 3 and 4, the 10% and 5% points of u were computed to 3 decimal places,
but lack of space has necessitated these being curtailed for publication. For the same
reason the values of the percentage points of u for the odd values of n = 11, 13, ..., 19 have
been omitted. In practical applications of the test, it is not considered that this reduction
will cause any undue inconvenience. Fuller tables have, however, been retained for
forthcoming work on the power of the u-test and are available for consultation if required.

(vii) Approximate values of the percentage points of u
If there are m subgroups each of n values, and if the estimate of standard deviation is
determined as the root-mean-square of the deviations of variate values from the respective
means of the subgroups, then the number of degrees of freedom is ν = m(n − 1). Unlike the
usual t-test, when the estimate of standard deviation is determined from the mean range in
m subgroups of equal size n, the percentage points of the modified t-distribution investigated
above depend upon the relation between m and n. Reference to Tables 3-8 indicates that,
for a constant number of degrees of freedom ν = m(n − 1), the values of the percentage points
on a given probability level vary slightly as m and n vary. For example, taking α = 0.05
and ν = 8, we have the following percentage points: 2.272 for m = 1 and n = 9, 2.254 for
m = 2 and n = 5, 2.250 for m = 4 and n = 3, and 2.264 for m = 8 and n = 2. In general,
however, the range in the values of the percentage points of u for a given value of ν is small,
and this permits the construction of a table giving approximate values of the six sets of
percentage points corresponding to different numbers of degrees of freedom.


Approximate values of percentage points of u

    Degrees of freedom                     Values of α
    ν = m(n − 1)        0.10    0.05    0.02    0.01    0.002    0.001
          1             5.0    10.1    25.4    50.8    254.0    507.9
          2             2.6     3.8     6.2     8.9     19.9     28.3
          3             2.2     3.0     4.2     5.5      9.4     11.8
          4             2.0     2.6     3.6     4.4      6.8      8.3
          5             1.9     2.5     3.3     3.9      5.7      6.7
          6             1.9     2.4     3.1     3.6      5.1      5.9
          7             1.8     2.3     3.0     3.5      4.7      5.4
          8             1.8     2.3     2.9     3.3      4.5      5.0
          9             1.8     2.2     2.8     3.2      4.3      4.8
         10             1.8     2.2     2.7     3.1      4.2      4.6
         11             1.8     2.2     2.7     3.1      4.1      4.5
         12             1.8     2.2     2.7     3.1      4.0      4.3
         13             1.8     2.2     2.7     3.1      3.9      4.3
         14             1.7     2.1     2.6     3.0      3.8      4.2
         15             1.7     2.1     2.6     2.9      3.7      4.1
         16             1.7     2.1     2.6     2.9      3.7      4.1
         17             1.7     2.1     2.6     2.9      3.7      4.1
         18             1.7     2.1     2.6     2.9      3.6      4.0
         19             1.7     2.1     2.6     2.9      3.6      4.0
         20             1.7     2.1     2.5     2.8      3.6      3.9
         30             1.7     2.0     2.4     2.7      3.4      3.6
         60             1.7     2.0     2.4     2.7      3.2      3.5
        120             1.7     2.0     2.4     2.6      3.2      3.4
          ∞             1.64    1.96    2.33    2.58     3.09     3.29


For a particular pair of values of m and n, the values of the percentage points for
ν = m(n − 1) degrees of freedom given in the table above are generally not in error by more
than one unit in the last place of figures. This degree of accuracy is frequently sufficient for
many practical applications of the distribution of u. To settle the significance of cases giving
values of u close to the above approximate values, reference should be made to the accurate
values given in Tables 3-8.

(viii) Applications of the u-test

The difference between the mean of a sample of n random values of a normally distributed
variate and the population value is shown in the Appendix to be independent of the total
range in the sample, and also independent of the mean range determined from random
subgroups of values. The modified t-test based on range estimates of standard deviation may
therefore be used in various statistical tests of significance involving deviations of sample
means. The application of this range test to sampling problems is analogous to that of the
well-known t-test, and no detailed description is therefore required. The most frequent use
of the new test will be found in the treatment of experimental data of various types, and
also in the examination of test results recorded for the purpose of control of the quality of
industrial products. In this latter type of work, cases frequently arise when it is desirable to
apply a rapid test for determining the significance of a difference between the mean of a
sample and some preassigned value, frequently some desired control level, or the significance
of the difference between two sample means. Furthermore, for routine purposes, it is often
desirable that the test should not only be rapid but also of a simple nature, thus enabling it
to be used by workers with little mathematical or even arithmetical aptitude. The new range
test has the advantages of greater simplicity and greatly reduced amount of computing
compared with the standard t-test. The use of range estimates of standard deviation, in
place of root-mean-square estimates, necessarily entails some loss of precision, but in a future
paper it will be shown that this reduction in accuracy is small and certainly negligible for
most practical purposes.

The most frequent applications of the range test are considered below and are followed
by several numerical examples in which, for purposes of comparison, the parallel treatment
by the t-test is also given. As in the t-test, the application of the range test involves the
assumption of normality of variate distribution and randomness of sampling. Furthermore,
where the standard deviation is estimated from the mean range of several subgroups of
values, care should be taken to ensure that the arrangement of these values is also random.
This latter condition is usually fulfilled by considering the values in the order in which they
were originally recorded. In a few cases, however, the order of recording may not be random;
the particular circumstances of a test may be such that the order of the observations may be
wholly or partly dependent upon their magnitude. In such cases a set of values can be
divided into random subgroups by the use of tables of random sampling numbers or by
other means.


(a) Difference between sample mean and population mean 

Suppose we have some preassigned value $\xi$, and wish to test whether the mean $\bar{x}$ of a 
sample of $N$ values may be considered as a reasonable estimate of $\xi$, or whether the difference 
between $\bar{x}$ and $\xi$ is real in the statistical sense. The usual assumption, the so-called ‘Student’s 
Hypothesis’, is made that $\bar{x}$ is the mean of a random sample from a normal population of 
which the mean is $\xi$ and the standard deviation is $\sigma$. The differences $(\bar{x} - \xi)$ will be distributed 
about a mean of zero with a standard error equal to $\sigma/\sqrt{N}$. If the sample be divided into $m$ 
random subgroups of equal size $n$, so that $N = mn$, and $\bar{w}$ is the mean of the $m$ ranges of the subgroups, 
then the sample estimate of the standard error of the mean is $\bar{w}/(d_n\sqrt{N})$. The ratio of the 
difference between the means to the estimate of its standard error is 

$$u = \frac{|\bar{x} - \xi|\, d_n \sqrt{N}}{\bar{w}}. \qquad (36)$$

If the computed value of $u$ exceeds the corresponding percentage point in one of Tables 
3-8, then the difference is considered unlikely to have arisen through random sampling on 
that particular probability level $\alpha$. As in the case of the t-test, when considering the 
asymmetrical case of ‘Student’s Hypothesis’, the values of $\alpha$ at the headings of the tables should 
be halved. 
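
As a rough illustration of equation (36), the following sketch computes $u$ for a sample split into consecutive subgroups. The function name `range_u` and the choice of two subgroups of five are assumptions made purely for illustration; the data are the ten differences of Example 1 below, and the value $d_5 = 2.3259$ is taken from Table 11.

```python
import math

def range_u(values, xi, n, d_n):
    """Equation (36): u = |mean - xi| * d_n * sqrt(N) / (mean subgroup range)."""
    N = len(values)
    if N % n != 0:
        raise ValueError("sample size must be a multiple of the subgroup size")
    # Consecutive subgroups of size n, taken in the order of recording.
    groups = [values[i:i + n] for i in range(0, N, n)]
    mean_range = sum(max(g) - min(g) for g in groups) / len(groups)
    mean = sum(values) / N
    return abs(mean - xi) * d_n * math.sqrt(N) / mean_range

# Differences (laevo minus dextro) of Example 1, tested against xi = 0.
diffs = [1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4]
u = range_u(diffs, xi=0.0, n=5, d_n=2.3259)   # m = 2 subgroups of n = 5
print(f"u(2, 5) = {u:.2f}")
```

The computed value would then be compared with the entry for $m = 2$, $n = 5$ in whichever of Tables 3-8 corresponds to the chosen level.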

For fairly small values of N, the estimate of the standard error of the mean may be deter- 
mined, not from the mean range in subgroups, but from the total range between the maximum 
and minimum values in the sample. In the notation used above, this corresponds to $m = 1$ 
and $n = N$. The test of the significance of the difference may be made as above and the 
computed value of u compared with the percentage points in Tables 3-8. In these cases, 
however, the computation may be curtailed by using the ratio 


$$\frac{\delta}{w} = \frac{u(1, n)}{d_n\sqrt{n}}, \qquad (37)$$


where $|\bar{x} - \xi| = \delta$, and $w$ is the range in the undivided sample. Table 9 gives values of the 
ratio $\delta/w$ for various levels of significance corresponding to the sum of the two tails of the 
distribution. For a chosen level of significance the difference $\delta$ is considered too large to 
have arisen through random sampling errors if the value of $\delta/w$ exceeds the corresponding 
tabulated value. Table 9 will also be found useful for giving a rapid estimate of the accuracy 
of the mean based on a small number of observations. 


(b) Difference between two sample means 

Suppose the first sample of size $N_1$ be divided into $m_1$ random subgroups of size $n$, and the 
second sample of size $N_2$ be divided into $m_2$ random subgroups also of size $n$, i.e. 

$$n = N_1/m_1 = N_2/m_2.$$

The hypothesis is made that each sample can be considered as a random selection from the 
same normal population. Let the numerical value of the difference between the two sample 
means be $|\bar{x}_1 - \bar{x}_2|$, and the mean of the $(m_1 + m_2)$ ranges of $n$ values be $\bar{w} = \bar{w}(m_1 + m_2, n)$, 
giving an estimate $\bar{w}/d_n$ for the standard deviation of the variate. The ratio of the difference 
between the two sample means to the range estimate of the standard error of the difference is 

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n}{\bar{w}\sqrt{1/N_1 + 1/N_2}}. \qquad (38)$$

The significance of the difference between the means in any particular case can be deter- 
mined by noting whether the computed value of u exceeds the corresponding percentage 
point for a chosen value of $\alpha$ by reference to Tables 3-8, using the column headed $m = m_1 + m_2$. 

When the samples are small and of equal size, say $n$, the variate standard deviation can 
be estimated from the two total ranges in the samples. If $w'$ and $w''$ are the two ranges, with 
a mean value $\bar{w} = \tfrac{1}{2}(w' + w'')$, then 

$$u = \frac{|\bar{x}_1 - \bar{x}_2|\, d_n \sqrt{\tfrac{1}{2}n}}{\bar{w}} \qquad (39)$$

may be used as above for testing the significance of the difference between the two means. 
A more rapid test may, however, be made by simply determining the value of the ratio of 
the difference between sample means to the average of the two sample ranges 

$$\frac{|\bar{x}_1 - \bar{x}_2|}{\tfrac{1}{2}(w' + w'')} = \frac{u(2, n)}{d_n\sqrt{\tfrac{1}{2}n}}. \qquad (40)$$

In Table 10 are given values of the above ratio lying on six different probability levels. For 
a given level of significance $\alpha$, values of the ratio smaller than those tabulated may be 
considered to have arisen through random sampling errors; greater values indicate that a given 
difference is unlikely to have arisen through chance and therefore point to a real difference. 

In the computation of $u$ it is necessary to use values of $d_n$, the mean range in samples from 
a normal population of unit standard deviation. A selection of the values determined by 
Tippett (1925) is reproduced in Table 11 to avoid the necessity of frequent reference to his 
original paper, and is accompanied by the corresponding values of $\sqrt{n}$ and $d_n\sqrt{n}$. 

(c) Confidence intervals 

As with ‘Student’s’ test, the tables of percentage points may be used to estimate, with 
a given measure of confidence, the interval within which it can be stated that $\xi$ or $\xi_1 - \xi_2$ lies. 
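
Written out in the notation of sections (a) and (b), and on the assumption that the interval is obtained by inverting the corresponding significance test in the usual way, the limits would take the form below. The subscripted symbols $u_\alpha$ (the tabulated percentage point of $u(m, n)$ in Tables 3-8) and $(\delta/w)_\alpha$ (the corresponding entry of Table 9) are introduced here only for illustration.

```latex
\xi = \bar{x} \pm u_\alpha \frac{\bar{w}}{d_n\sqrt{N}},
\qquad
\xi_1 - \xi_2 = (\bar{x}_1 - \bar{x}_2) \pm u_\alpha \frac{\bar{w}}{d_n}\sqrt{\frac{1}{N_1} + \frac{1}{N_2}},
\qquad
\xi = \bar{x} \pm \left(\frac{\delta}{w}\right)_{\alpha} w \quad \text{(single undivided sample)}.
```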

Examples 

Example 1. The following data have been previously used as an example by ‘Student’ 
(1908). Ten patients were treated with the optical isomers of hyoscyamine hydrobromide 
and the additional hours of sleep were noted. 


Additional hours sleep gained by use of hyoscyamine hydrobromide 

Patient    Dextro- (D)    Laevo- (L)    Difference (L − D)
   1          +0.7           +1.9             +1.2
   2          −1.6           +0.8             +2.4
   3          −0.2           +1.1             +1.3
   4          −1.2           +0.1             +1.3
   5          −0.1           −0.1              0.0
   6          +3.4           +4.4             +1.0
   7          +3.7           +5.5             +1.8
   8          +0.8           +1.6             +0.8
   9           0.0           +4.6             +4.6
  10          +2.0           +3.4             +1.4
 Means        +0.75          +2.33            +1.58

The last column may be used for the controlled comparison of the two drugs, since their 
effects were measured on the same ten patients. The laevo form has given a greater figure for 
the additional hours sleep than the dextro form. Whether the former may be considered as 
the better soporific is examined by both the standard deviation and range tests. 

(a) The sum of squares of deviations of the differences about their mean value is 13.616, 
associated with 9 degrees of freedom. The estimate of the standard error is therefore 0.3890, 
and the value of t works out to be 1.58/0.3890 = 4.06. For 9 degrees of freedom a value of 
t = 3.250 lies on the 1 % level of significance. Assuming normal random sampling, a value 
of t equal to or greater than 4.06 will occur much less frequently than once in a hundred times. 
This leads to the conclusion that the laevo form is better for producing sleep than the dextro 
form. 

(b) For examination by the range, the value of $u = u(1, 10)$ may be computed, but in this 
case it is simpler to use the shortened method of equation (37). The ratio of the mean difference 
to the range in the ten individual differences is $\delta/w = 1.58/4.6 = 0.34$. Reference to 
Table 9 shows that this value is slightly in excess of the tabulated value 0.333 on the 1 % 
level of significance, leading to the same conclusion as that drawn from the t-test. 
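
The arithmetic of parts (a) and (b) may be retraced with the short sketch below. The variable names are illustrative only; the two critical values are simply those quoted in the text (3.250 for t with 9 degrees of freedom, and the Table 9 entry 0.333 for n = 10), not quantities computed by the code.

```python
import math

dextro = [0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0]
laevo = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]
diffs = [lv - dx for lv, dx in zip(laevo, dextro)]

n = len(diffs)
mean = sum(diffs) / n
ss = sum((x - mean) ** 2 for x in diffs)   # sum of squares about the mean
std_err = math.sqrt(ss / (n - 1) / n)      # standard error of the mean
t = mean / std_err                         # (a) 'Student's' t, 9 degrees of freedom

w = max(diffs) - min(diffs)                # range of the ten differences
ratio = mean / w                           # (b) delta / w of equation (37)

T_1PCT_9DF = 3.250       # 1 % point of t quoted in the text
RATIO_1PCT_N10 = 0.333   # 1 % entry of Table 9 for n = 10

print(f"t = {t:.2f}, delta/w = {ratio:.3f}")
print("significant at 1 %:", t > T_1PCT_9DF, ratio > RATIO_1PCT_N10)
```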



The greater significance suggested by the t-test seems to be largely due to the exceptional 
difference L − D for Patient No. 9, viz. 4.6, which affects $\delta$ more seriously than $w$. 

Example 2. In the calibration of a viscometer it is necessary to time the interval required 
for the level of an aqueous solution of glycerol to fall between two fixed marks. For satisfactory 
calibration it is considered desirable that the mean time of flow should be accurate 
to ±½ sec., risking a greater error not more frequently than 1 in 20 times. Five independent 
determinations of the time interval (in seconds) for one viscometer were 103.5, 104.1, 102.7, 
103.2 and 102.6. While this number of observations is clearly too small for a final assessment 
of accuracy, it is often useful to get an interim answer to guide further action. 

(a) The sum of squares of the deviations of the five observations about their mean is 
1.508, associated with 4 degrees of freedom, giving an estimate of the standard error of the 
mean equal to 0.275. Reference to tables shows that a value of t equal to 2.776 lies on the 
5 % level of significance. Hence in 19 times out of 20 it would be expected that a sample mean 
will not diverge from the true mean value by more than ±2.776 × 0.275 = ±0.76 sec. This 
error exceeds the assigned limits of ±½ sec. and therefore points to the necessity of further 
tests to fulfil the required conditions. 

(b) Instead of computing an estimate of the standard error of the mean from the range 
($w = 104.1 - 102.6 = 1.5$) in the five determinations, we note from Table 9 that a value of 
$\delta/w = 0.507$ lies on the 5 % level of significance. Hence in 19 times out of 20 the sample 
mean will differ from its true value by an amount up to a deviation of 

$$\pm\delta = \pm 0.507 \times 1.5 = \pm 0.76,$$

a result in agreement with that yielded by the t-test. 
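
The two half-widths of this example may be checked as follows. This is a minimal sketch: the multipliers 2.776 and 0.507 are those quoted in the text and in Table 9, and the variable names are illustrative.

```python
import math

times = [103.5, 104.1, 102.7, 103.2, 102.6]   # seconds
n = len(times)
mean = sum(times) / n

# (a) t method: half-width = (5 % point of t, 4 d.f.) * standard error of the mean.
ss = sum((x - mean) ** 2 for x in times)
half_width_t = 2.776 * math.sqrt(ss / (n - 1) / n)

# (b) Range method: half-width = (Table 9 entry for n = 5, 5 %) * range.
half_width_range = 0.507 * (max(times) - min(times))

print(f"t: +/-{half_width_t:.2f} sec, range: +/-{half_width_range:.2f} sec")
```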

Example 3. In the processing of raw cotton, modifications were made in the design of one 
of the machines with the object of improving the efficiency of cleaning. Tests were made 
on a series of 24 different mixings for the purpose of determining whether yarn strength was 
adversely affected by the mechanical alterations. The results of the 24 pairs of comparisons 
are given below (the strength being expressed as a count x strength product), together with 
the differences between them expressed as percentages of the corresponding strengths under 
standard conditions. 


Yarn strengths under standard and modified conditions 

      Strength          Percentage            Strength          Percentage
 Standard  Modified     difference       Standard  Modified     difference
    S         M        100(M − S)/S         S         M        100(M − S)/S
   1805      1763         −2.3             1931      1898         −1.7
   1870      1901         +1.7             1508      1520         +0.8
   2000      2026         +1.3             2111      2119         +0.4
   1823      1904         +4.4             1496      1481         −1.0
   1603      1619         +1.0             1672      1723         +3.1
   1889      1830         −3.1             1947      1759         −9.7
   2058      2019         −1.9             1960      1934         −1.3
   1806      1850         +2.4             1624      1594         −1.8
   1056      1112         +5.3             2162      2170         +0.4
   1857      1782         −4.0             1915      1967         +2.7
   1801      1720         −4.5             1738      1810         +4.1
   2094      2144         +2.4             1609      1613         +0.2




The mean value for the percentage difference in strength is −0.46. Whether this is an 
indication that the mechanical modifications have resulted in the production of weaker 
yarns is examined by means of the standard deviation and range tests. 

(a) The sum of squares of the deviations of the percentage differences about their mean 
value is 256.48, based on 23 degrees of freedom. The estimate of the standard deviation of 
the percentage differences is 3.34 and the standard error of their mean value is 0.68, giving 
a value of t = 0.46/0.68 = 0.69. This is much below the value of 2.069 on the 5 % level of 
significance and leads to the conclusion that there are no grounds for suspecting that the 
mechanical alterations have led to the production of weaker yarns. 

(b) The number of observations places this case outside the range of Table 9, and it is 
therefore necessary to use the modified u-function. The data are arranged in random order 
of their occurrence, and split into four groups of six. The ranges in the sets of six differences 
are 7.6, 9.8, 12.8 and 5.9 with a mean value $\bar{w}(4, 6) = 9.0$. The estimate of the variate standard 
deviation is $\bar{w}(4, 6)/d_6 = 3.56$, giving 0.72 for the standard error of the mean percentage 
difference, and $u = 0.46/0.72 = 0.64$. The 5 % level of significance is, from Table 4, equal to 
2.07, much greater than the value computed from the data and therefore indicates the same 
conclusion as above. 

Example 4. Independent determinations of percentage trash content were made in triplicate 
on two samples of raw cotton and the following results obtained: 


Percentage trash content of raw cotton 

            Sample A    Sample B
              1.13        0.76
              1.31        0.64
              1.26        1.01
 Means        1.23        0.80


The point to be decided is whether sample B may be said to be cleaner than sample A, 
or whether the difference between the two average percentage trash contents may be 
accounted for by random experimental variation. Since, in this case, the comparisons are 
not paired, the standard error of the difference between the mean values of the two samples 
has necessarily to be estimated from the variation within each of the two sets of results. 
As before, normal variation in sampling and in testing errors is assumed. 

(a) The sum of squares of deviations of each set of values from their mean is 0.01680 for A 
and 0.07167 for B, each associated with 2 degrees of freedom. The best estimate of the error 
standard deviation is therefore 0.149, giving 0.122 for the estimate of the standard error 
of the difference between the two means. The value of t is equal to (1.23 − 0.80)/0.122 = 3.5, 
which exceeds the value 2.776 obtained from tables for 4 degrees of freedom and $\alpha = 0.05$. 
On this level of significance the result is taken to indicate a real difference in the cleanliness 
of the two cottons. 

(b) The difference between the two sample means is 0.43 and the mean of the two ranges 
is 0.275. Hence, using the ratio of equation (40), $|\bar{x}_1 - \bar{x}_2| / \tfrac{1}{2}(w' + w'') = 1.6$ which, from 
Table 10, is seen to exceed the value of 1.272 lying on the 5 % level and therefore is taken to 
indicate a significant difference in the mean values. 
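
A short sketch of both calculations for this example follows. The six readings are taken as 1.13, 1.31, 1.26 for A and 0.76, 0.64, 1.01 for B (the reading consistent with the quoted means and mean range); the variable names are illustrative, and figures recomputed in this way may differ in the last place from those printed above.

```python
import math

a = [1.13, 1.31, 1.26]   # percentage trash, sample A
b = [0.76, 0.64, 1.01]   # percentage trash, sample B
n = len(a)

mean_a, mean_b = sum(a) / n, sum(b) / n
diff = abs(mean_a - mean_b)

# (b) Range test, equation (40): difference of means over the mean of the two ranges.
mean_range = ((max(a) - min(a)) + (max(b) - min(b))) / 2
ratio = diff / mean_range

# (a) t-test with the pooled sum of squares, 2 + 2 degrees of freedom.
ss = sum((x - mean_a) ** 2 for x in a) + sum((x - mean_b) ** 2 for x in b)
pooled_sd = math.sqrt(ss / (2 * (n - 1)))
t = diff / (pooled_sd * math.sqrt(2 / n))

print(f"ratio = {ratio:.2f} (Table 10, 5 %: 1.272)")
print(f"t     = {t:.2f} (5 % point for 4 d.f.: 2.776)")
```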



Example 5. The following strength test results were obtained on two batches of cotton 
yarn (measurements recorded to the nearest ½ lb.) and are noted downwards in order of 
random occurrence: 

            Sample A                  Sample B
     30.5    31.0    29.5          27.0    28.5
     28.0    31.5    27.5          28.5    25.0
     29.5    30.0    28.0          26.5    28.0
     28.0    27.5    26.0          27.0    27.5
     28.5    29.5    28.5          27.0    27.5
     29.5    27.5    26.5          28.5    28.5
     27.5    28.0    27.0          28.0    28.0
     28.0    32.5    30.0          28.0    26.0
     29.5    28.5    28.5          25.0    26.5
     30.5    29.0    31.0          29.0    28.0

The mean of sample A is 28.90 lb. and that of sample B is 27.40 lb., and the question arises as 
to whether sample B is actually weaker than A. 

(a) The sums of squares of the deviations about their respective mean values are 68.2 
for A and 24.8 for B, associated with 29 and 19 degrees of freedom. The estimate of the error 
standard deviation is therefore 1.392, giving 0.402 for the standard error of the difference 
between the two means and a value of t equal to (28.9 − 27.4)/0.402 = 3.7. For 48 degrees of 
freedom a value of t = 2.68 lies on the 1 % level of significance. The greater value of 3.7 
yielded by the data above indicates, therefore, that the difference in strength of the two yarns 
may be accepted as statistically significant. 

(b) The estimate of the error standard deviation is obtained from the ranges within groups 
of ten values, three groups for sample A and two for sample B. The values of these five 
ranges are 3.0, 5.0, 5.0, 4.0 and 3.5 with a mean value of 4.1 and a corresponding estimate of 
error standard deviation equal to $\bar{w}(5, 10)/d_{10} = 1.33$. The estimate of the standard error of 
the difference between the two means is 0.384 and the value of $u$ is (28.9 − 27.4)/0.384 = 3.9. 
For five ranges of ten, the value of $u$ on the 1 % level of significance is, from Table 6, equal to 
2.69 (cf. 2.68 for t with 48 degrees of freedom). The value of 3.9 obtained from the data is 
greater than this value of 2.69 and this again leads to the conclusion that the difference in 
mean strengths of the two yarns is ‘statistically significant’. 
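
The range calculation of part (b) may be sketched as below, using equation (38). It is assumed here that the table is read downwards in columns of ten (three columns for A, two for B), and that the ninth entry of the first B column is 25.0, which is the reading consistent with the quoted mean, ranges and sum of squares. The value of $d_{10}$ is taken from Table 11.

```python
import math

# Columns of ten readings in recorded order: three for sample A, two for sample B.
a_groups = [
    [30.5, 28.0, 29.5, 28.0, 28.5, 29.5, 27.5, 28.0, 29.5, 30.5],
    [31.0, 31.5, 30.0, 27.5, 29.5, 27.5, 28.0, 32.5, 28.5, 29.0],
    [29.5, 27.5, 28.0, 26.0, 28.5, 26.5, 27.0, 30.0, 28.5, 31.0],
]
b_groups = [
    [27.0, 28.5, 26.5, 27.0, 27.0, 28.5, 28.0, 28.0, 25.0, 29.0],
    [28.5, 25.0, 28.0, 27.5, 27.5, 28.5, 28.0, 26.0, 26.5, 28.0],
]
D_10 = 3.0775  # d_n for n = 10, from Table 11

a = [x for g in a_groups for x in g]
b = [x for g in b_groups for x in g]
mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)

ranges = [max(g) - min(g) for g in a_groups + b_groups]
mean_range = sum(ranges) / len(ranges)
sd_est = mean_range / D_10                              # range estimate of sigma
std_err = sd_est * math.sqrt(1 / len(a) + 1 / len(b))   # s.e. of the difference
u = abs(mean_a - mean_b) / std_err                      # equation (38)

print(ranges, f"u(5, 10) = {u:.2f}")
```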

Note added in proof. Since the present paper went to press, a note by Daly (1946) has 
been published, in which it is suggested that the range may be used in place of the 
root-mean-square estimate of variance in a test analogous to the t-test. The case where the 
estimate of standard deviation is obtained from a single range is discussed and values of the ratio 
(deviation)/(range) on the 10 % level of significance are given to two significant figures 
for a number of low values of $n$. These agree with the corresponding values given in 
Table 9, for $\alpha = 0.10$, of the present paper. [Mr Lord’s paper was first submitted for 
publication in August 1946. Ed.] 





APPENDIX 


On the independence of mean and some linear estimates of standard 
deviation in random samples from a normal population 

In the above practical applications of the u distribution to normal random sampling pro- 
blems, it has been implicitly assumed that range estimates of standard deviation, like root- 
mean-square estimates, are independent of the mean of the sample from which they have 
been determined. The validity of this assumption is established below, where it is shown as 
a particular case of a more general theorem. 

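Consider a set of $n$ random values of a variable from a normal population of distribution 

$$p(x)\,dx = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\xi)^2}{2\sigma^2}\right] dx. \qquad (1)$$

Of such a set let $x_p$ and $x_q$ denote the $p$th and $q$th values ($p < q$) in ascending order of magnitude, 
and denote the remaining $(n - 2)$ values such that 

$$\begin{aligned}
-\infty < x_r \le x_p, &\quad r = 1, 2, \ldots, (p-1),\\
x_p \le x_r \le x_q, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
x_q \le x_r < \infty, &\quad r = (q+1), (q+2), \ldots, n.
\end{aligned} \qquad (2)$$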


Now a set of any $n$ values may be arranged in $n!$ ways and in random samples all arrangements 
of the same $n$ values are of equal probability. The group of $(p-1)$ values all less than 
$x_p$ are not ranked in any particular order, and there are hence $(p-1)!$ ways in which they 
may be arranged. Similarly, the group of values from $x_{p+1}$ to $x_{q-1}$ may be arranged in 
$(q-p-1)!$ ways and the third group from $x_{q+1}$ to $x_n$ in $(n-q)!$ ways. The distribution of 
random samples in which the $p$th and $q$th values in ascending order are denoted by $x_p$ 
and $x_q$, and the remaining values satisfy the conditions in (2), is therefore given by 

$$\left[\frac{n!}{(p-1)!\,(q-p-1)!\,(n-q)!}\right] \frac{1}{(2\pi)^{n/2}\sigma^{n}} \exp\left[-\frac{1}{2\sigma^2}\sum_{r=1}^{n}(x_r-\xi)^2\right] dx_1\,dx_2 \ldots dx_n, \qquad (3)$$

where the constant term in brackets makes the total frequency of all such samples equal to 
unity. 

The joint distribution of the sample mean $\bar{x}$ of the $n$ values and the difference $\Delta = (x_q - x_p)$ 
is given by 

$$p(\bar{x},\Delta)\,d\bar{x}\,d\Delta = \frac{n!}{(p-1)!\,(q-p-1)!\,(n-q)!}\,\frac{1}{(2\pi)^{n/2}\sigma^{n}} \int\!\cdots\!\int \exp\left[-\frac{1}{2\sigma^2}\sum_{r=1}^{n}(x_r-\xi)^2\right] dx_1\,dx_2 \ldots dx_n, \qquad (4)$$

where the multiple integral is evaluated over the domain of the $x$'s conditioned by the limits 
indicated in (2) and by $\Delta = (x_q - x_p)$ and $\bar{x} = \frac{1}{n}\sum_{r=1}^{n} x_r$. 




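Make the transformation to variables defined by 

$$\begin{aligned}
\bar{x} &= \frac{1}{n}(x_1 + x_2 + \ldots + x_n),\\
y_1 &= -x_p + x_1,\\
y_2 &= -x_p + x_2,\\
&\;\;\vdots\\
y_{p-1} &= -x_p + x_{p-1},\\
y_{p+1} &= -x_p + x_{p+1},\\
&\;\;\vdots\\
y_n &= -x_p + x_n.
\end{aligned} \qquad (5)$$

The Jacobian of the transformation is 

$$\frac{\partial(x_1, \ldots, x_n)}{\partial(\bar{x}, y_1, \ldots, y_{p-1}, y_{p+1}, \ldots, y_n)} = 1,$$

and, from (2) and (5), the transformed limits of integration are 

$$\begin{aligned}
-\infty < y_r \le 0, &\quad r = 1, 2, \ldots, (p-1),\\
0 \le y_r \le \Delta, &\quad r = (p+1), (p+2), \ldots, (q-1),\\
\Delta \le y_r < \infty, &\quad r = (q+1), (q+2), \ldots, n,\\
y_q = \Delta. &
\end{aligned} \qquad (6)$$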


Using the relations in (5) above it may easily be shown that 

$$\sum_{r=1}^{n}(x_r-\xi)^2 = n(\bar{x}-\xi)^2 + \frac{n-1}{n}\sum_{r=1}^{n} y_r^2 - \frac{2}{n}\sum_{s=1}^{n-1}\sum_{t=1}^{n-s} y_s\,y_{s+t}, \qquad (7)$$


where in the summations on the right $r, s, s+t \ne p$. With the new variables we have, from 
(4), (5), (6) and (7), the joint distribution of $\bar{x}$ and $\Delta$ given by 

$$p(\bar{x},\Delta)\,d\bar{x}\,d\Delta = \left[\frac{1}{\sqrt{2\pi}\,\sigma/\sqrt{n}} \exp\left\{-\frac{n(\bar{x}-\xi)^2}{2\sigma^2}\right\} d\bar{x}\right] \times \left[\frac{\sqrt{n}\,(n-1)!\;d\Delta}{(p-1)!\,(q-p-1)!\,(n-q)!\,(2\pi)^{\frac{1}{2}(n-1)}\sigma^{n-1}} \int_{-\infty}^{0}\!\!\cdots\!\int_{-\infty}^{0} \int_{0}^{\Delta}\!\!\cdots\!\int_{0}^{\Delta} \int_{\Delta}^{\infty}\!\!\cdots\!\int_{\Delta}^{\infty} \exp\left\{-\frac{1}{2n\sigma^2}\left((n-1)\sum_{r=1}^{n} y_r^2 - 2\sum_{s=1}^{n-1}\sum_{t=1}^{n-s} y_s\,y_{s+t}\right)\right\} dy_1 \ldots dy_{p-1}\; dy_{p+1} \ldots dy_{q-1}\; dy_{q+1} \ldots dy_n\right] \qquad (8)$$

with the restriction that $r, s, s+t \ne p$, and $\Delta$ is to be substituted for $y_q$. 

The term in the first bracket of (8) is the distribution of the sample mean $\bar{x}$. It follows, 
therefore, that the term in the second bracket is the distribution of $\Delta$, because this expression 
does not involve $\bar{x}$ but is a function of $\Delta$ alone. This indicates that, in random samples from 
a normal population, the difference between the $p$th and $q$th values in order of magnitude 
is independent of the sample mean. It follows, therefore, that all estimates of the population 
standard deviation $\sigma$ determined from ranked variate differences (e.g. from the 
semi-interquartile range or other percentile measures of dispersion) are independent of the 
corresponding sample mean. 

As a special case, when $p = 1$ and $q = n$, the difference between the $p$th and $q$th values 
becomes the difference between the lowest and highest, i.e. the range of the sample. Furthermore, 
if the values in a sample be divided into random subgroups, a simple extension of the 
argument shows that there is also statistical independence between sample mean and the 
corresponding mean range of the subgroups. 
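
The independence of mean and range may also be examined empirically. The sketch below is only an illustrative check and not part of the proof: it draws normal samples of five, and computes the correlation between sample mean and sample range, which should be near zero. A zero correlation is of course necessary for independence but not sufficient. The seed, sample size and number of trials are arbitrary assumptions.

```python
import math
import random

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1947)        # fixed seed so that the run is repeatable
n, trials = 5, 20000
means, ranges = [], []
for _ in range(trials):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    means.append(sum(sample) / n)
    ranges.append(max(sample) - min(sample))

r = corr(means, ranges)
avg_range = sum(ranges) / trials  # expected to lie close to d_5 = 2.3259
print(f"correlation = {r:+.3f}, mean range = {avg_range:.3f}")
```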





Table 3. 10 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30      60
   2     5.04    2.62    2.20    2.03    1.94    1.89    1.82    1.78    1.73    1.71    1.69    1.67(1)
   3     2.59    2.02    1.88    1.81    1.77    1.75+   1.72    1.71    1.69    1.67    1.66    1.66(1)
   4     2.18    1.88    1.79    1.75+   1.73    1.72    1.70    1.69    1.67    1.67    1.66    1.65+
   5     2.02    1.81    1.75+   1.73    1.71    1.70    1.68    1.68    1.67    1.66    1.66    1.65
   6     1.94    1.78    1.73    1.71    1.70    1.69    1.68    1.67    1.66    1.66    1.65+   1.65-
   7     1.88    1.76    1.72    1.70    1.69    1.68    1.67    1.67    1.66    1.66    1.65+   1.65-
   8     1.85    1.74    1.71    1.69    1.68    1.68    1.67    1.66    1.66    1.65+   1.65+   1.65-
   9     1.82    1.73    1.70    1.69    1.68    1.67    1.67    1.66    1.66    1.65+   1.65    1.65-
  10     1.81    1.72    1.70    1.68    1.68    1.67    1.66    1.66    1.65+   1.65+   1.65    1.65-
  12     1.78    1.71    1.69    1.68    1.67    1.67    1.66    1.66    1.65+   1.65+   1.65    1.65-
  14     1.76    1.70    1.68    1.67    1.67    1.66    1.66    1.66    1.65+   1.65    1.65-   1.65-
  16     1.75    1.70    1.68    1.67    1.67    1.66    1.66    1.65+   1.65+   1.65    1.65-   1.65-
  18     1.74    1.69    1.68    1.67    1.66    1.66    1.66    1.65+   1.65+   1.65-   1.65    1.65-
  20     1.73    1.69    1.67    1.67    1.66    1.66    1.66    1.65+   1.65    1.65-   1.65-   1.65-


Table 4. 5 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30      60
   2    10.14    3.87    2.98    2.66    2.49    2.38    2.26    2.20    2.11    2.07    2.03    2.00(2)
   3     3.82    2.64    2.37    2.25    2.19    2.14    2.09    2.07    2.03    2.01    1.99    1.98(1)
   4     2.95+   2.37    2.22    2.15-   2.11    2.08    2.05    2.03    2.01    2.00    1.98    1.97
   5     2.63    2.25+   2.15-   2.10    2.07    2.05    2.03    2.01    2.00    1.99    1.98    1.97(1)
   6     2.48    2.19    2.11    2.07    2.05-   2.03    2.01    2.00    1.99    1.98    1.97    1.97(1)
   7     2.38    2.15+   2.09    2.05+   2.03    2.02    2.01    2.00    1.98    1.98    1.97    1.97(1)
   8     2.32    2.13    2.07    2.04    2.02    2.01    2.00    1.99    1.98    1.98    1.97    1.97(1)
   9     2.27    2.11    2.06    2.03    2.02    2.01    2.00    1.99    1.98    1.97    1.97    1.96
  10     2.24    2.09    2.05    2.02    2.01    2.00    1.99    1.98    1.98    1.97    1.97    1.96
  12     2.19    2.07    2.03    2.01    2.00    2.00    1.99    1.98    1.97    1.97    1.97    1.96
  14     2.16    2.06    2.02    2.01    2.00    1.99    1.98    1.98    1.97    1.97    1.97    1.96
  16     2.14    2.05-   2.02    2.00    1.99    1.99    1.98    1.98    1.97    1.97    1.97    1.96
  18     2.12    2.04    2.01    2.00    1.99    1.99    1.98    1.98    1.97    1.97    1.97    1.96
  20     2.11    2.03    2.01    2.00    1.99    1.98    1.98    1.97    1.97    1.97    1.96    1.96

Note. The numbers in brackets in the column headed m = 60 indicate the number of units which must 
be subtracted in the second decimal place to obtain the level for m = 120 and the same value of n. Where 
no figure is given u(120, n) = u(60, n) to second decimal place accuracy. E.g. for the 5 % level, 
u(120, 2) = 1.98. 





Table 5. 2 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30      60
   2    25.39    6.27    4.27    3.60    3.27    3.08    2.86    2.73    2.59    2.52    2.45-   2.39(3)
   3     6.19    3.56    3.05-   2.84    2.72    2.65-   2.56    2.51    2.45    2.42    2.39    2.36(2)
   4     4.21    3.05-   2.77    2.65-   2.58    2.53    2.48    2.45-   2.41    2.39    2.37    2.35(1)
   5     3.56    2.84    2.65-   2.56    2.51    2.48    2.44    2.42    2.39    2.37    2.36    2.34(1)
   6     3.25    2.73    2.58    2.51    2.47    2.45-   2.42    2.40    2.37    2.36    2.35+   2.34(1)
   7     3.07    2.66    2.54    2.48    2.45    2.43    2.40    2.39    2.37    2.36    2.35-   2.34(1)
   8     2.95+   2.61    2.51    2.46    2.43    2.42    2.39    2.38    2.36    2.35+   2.34    2.34(1)
   9     2.87    2.58    2.49    2.45-   2.42    2.41    2.39    2.37    2.36    2.35+   2.34    2.33
  10     2.81    2.55+   2.47    2.44    2.41    2.40    2.38    2.37    2.35+   2.35-   2.34    2.33
  12     2.72    2.51    2.45-   2.42    2.40    2.39    2.37    2.36    2.35+   2.34    2.34    2.33
  14     2.67    2.49    2.43    2.41    2.39    2.38    2.37    2.36    2.35-   2.34    2.34    2.33
  16     2.63    2.47    2.42    2.40    2.38    2.37    2.36    2.35+   2.35-   2.34    2.34    2.33
  18     2.60    2.46    2.41    2.39    2.38    2.37    2.36    2.35    2.34    2.34    2.33    2.33
  20     2.58    2.45-   2.41    2.39    2.37    2.37    2.36    2.35    2.34    2.34    2.33    2.33


Table 6. 1 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30      60
   2    50.79    8.93    5.49    4.43    3.93    3.64    3.32    3.14    2.93    2.84    2.75-   2.66(4)
   3     8.82    4.34    3.60    3.30    3.14    3.03    2.91    2.84    2.75-   2.70    2.66    2.62(2)
   4     5.42    3.60    3.20    3.02    2.92    2.86    2.79    2.74    2.68    2.66    2.63    2.60(1)
   5     4.38    3.29    3.02    2.90    2.83    2.79    2.73    2.70    2.66    2.64    2.62    2.60(1)
   6     3.90    3.13    2.93    2.83    2.78    2.74    2.70    2.67    2.64    2.62    2.61    2.59(1)
   7     3.63    3.03    2.87    2.79    2.75    2.72    2.68    2.66    2.63    2.62    2.60    2.59(1)
   8     3.45+   2.97    2.83    2.76    2.72    2.70    2.67    2.65-   2.62    2.61    2.60    2.59(1)
   9     3.33    2.92    2.80    2.74    2.71    2.68    2.66    2.64    2.62    2.61    2.60    2.59(1)
  10     3.24    2.88    2.78    2.72    2.69    2.67    2.65    2.63    2.61    2.60    2.59    2.59(1)
  12     3.12    2.83    2.74    2.70    2.68    2.66    2.64    2.62    2.61    2.60    2.59    2.58
  14     3.05    2.80    2.72    2.69    2.66    2.65-   2.63    2.62    2.60    2.60    2.59    2.58
  16     2.99    2.78    2.71    2.67    2.65+   2.64    2.62    2.61    2.60    2.60    2.59    2.58
  18     2.95    2.76    2.70    2.66    2.65    2.63    2.62    2.61    2.60    2.59    2.59    2.58
  20     2.92    2.74    2.69    2.66    2.64    2.63    2.62    2.61    2.60    2.59    2.59    2.58

Note. The numbers in brackets in the column headed m = 60 indicate the number of units which must 
be subtracted in the second decimal place to obtain the level for m = 120 and the same value of n. Where 
no figure is given u(120, n) = u(60, n) to second decimal place accuracy. E.g. for the 2 % level 
u(120, 5) = 2.33. 





Table 7. 0.2 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30
   2    254.0    20.1     9.4     6.8     5.7     5.1     4.5-    4.1     3.7     3.6     3.4(2)
   3     19.8     6.8     5.1     4.5     4.2      …      3.7     3.6     3.4     3.3     3.2
   4      9.4     5.1     4.2     3.9     3.7     3.6     3.5-    3.4     3.3     3.2     3.2(1)
   5      6.8     4.4     3.9     3.7     3.5+    3.5-    3.4     3.3     3.2     3.2     3.2(1)
   6      5.7     4.1     3.7     3.5+    3.4     3.4     3.3     3.3     3.2     3.2     3.1
   7      5.1     3.9     3.6     3.5-    3.4     3.3     3.3     3.2     3.2     3.2     3.1
   8      4.7     3.8     3.5+    3.4     3.3     3.3     3.2     3.2     3.2     3.2     3.1
   9      4.5     3.7     3.5-    3.4     3.3     3.3     3.2     3.2     3.2     3.1     3.1
  10      4.3     3.6     3.4     3.3     3.3     3.3     3.2     3.2     3.2     3.1     3.1
  12      4.1     3.5+    3.4     3.3     3.3     3.2     3.2     3.2     3.1     3.1     3.1
  15      3.9     3.5-    3.3     3.3     3.2     3.2     3.2     3.2     3.1     3.1     3.1
  20      3.7     3.4     3.3     3.2     3.2     3.2     3.2     3.1     3.1     3.1     3.1


Table 8. 0.1 % points of u = u(m, n) 

 n \ m     1       2       3       4       5       6       8      10      15      20      30
   2    507.9    28.5-   11.7     8.1     6.7     5.9     5.0     4.6     4.1     3.9     3.7(2)
   3     28.1     8.7     6.0     5.1     4.6     4.3     4.0     3.9     3.7     3.6     3.5(1)
   4     11.8     5.8     4.7     4.3     4.1     3.9     3.7     3.6     3.5+    3.5-    3.4(1)
   5      8.2     5.0     4.3     4.0     3.8     3.7     3.6     3.6     3.5-    3.4     3.4(1)
   6      6.7     4.5+    4.0     3.8     3.7     3.6     3.5+    3.5-    3.4     3.4     3.4(1)
   7      5.9     4.3     3.9     3.7     3.6     3.6     3.5+    3.5-    3.4     3.4     3.3
   8      5.4     4.1     3.8     3.7     3.6     3.5+    3.5-    3.4     3.4     3.4     3.3
   9      5.0     4.0     3.8     3.6     3.6     3.5+    3.5-    3.4     3.4     3.4     3.3
  10      4.8     4.0     3.7     3.6     3.5+    3.5     3.4     3.4     3.4     3.4     3.3
  12      4.5+    3.8     3.6     3.6      …      3.5-    3.4     3.4     3.4     3.3     3.3
  15      4.2     3.7     3.6     3.5+    3.5-    3.4     3.4     3.4     3.3     3.3     3.3
  20      4.0     3.6     3.5+    3.5-    3.4     3.4     3.4     3.4     3.3     3.3     3.3

Note. The numbers in brackets in the column headed m = 30 indicate the number of units which must 
be subtracted in the first decimal place to obtain the level for m = 60 and the same value of n. Where 
no figure is given u(60, n) = u(30, n) to first decimal place accuracy; u(120, n) = u(60, n) for (i) all 0.2 % 
points except that u(120, 3) = 3.1 and (ii) all 0.1 % points except that u(120, 2) = 3.4 and u(120, 3) = 3.3. 





Table 9. Table for testing the significance of the deviation of the mean 
of a small sample (of size n) from some pre-assigned value 

 n \ α     0.10     0.05     0.02     0.01     0.002    0.001
   2      3.157    6.353   15.910   31.828   159.16   318.31
   3      0.885-   1.304    2.111    3.008     6.77     9.58
   4      0.529    0.717    1.023    1.316     2.29     2.85+
   5      0.388    0.507    0.685+   0.843     1.32     1.58
   6      0.312    0.399    0.523    0.628     0.92     1.07
   7      0.263    0.333    0.429    0.507     0.71     0.82
   8      0.230    0.288    0.366    0.429     0.59     0.67
   9      0.205-   0.255+   0.322    0.374     0.50     0.57
  10      0.186    0.230    0.288    0.333     0.44     0.50
  11      0.170    0.210    0.262    0.302     0.40     0.44
  12      0.158    0.194    0.241    0.277     0.36     0.40
  13      0.147    0.181    0.224    0.256     0.33     0.37
  14      0.138    0.170    0.209    0.239     0.31     0.34
  15      0.131    0.160    0.197    0.224     0.29     0.32
  16      0.124    0.151    0.186    0.212     0.27     0.30
  17      0.118    0.144    0.177    0.201     0.26     0.28
  18      0.113    0.137    0.168    0.191     0.24     0.26
  19      0.108    0.131    0.161    0.182     0.23     0.25
  20      0.104    0.126    0.154    0.175-    0.22     0.24

The table gives values of the ratio δ/w = (deviation of sample mean)/(range in sample) lying on 
different levels of significance, the levels being the sum, α, of the two tails of the probability distribution. 


Table 10. Table for testing the significance of the difference between
the means of two small samples of equal size n

                              α
  n      0.10     0.05     0.02     0.01     0.002    0.001

  2     2.322    3.427    5.553    7.916    17.81    25.23
  3     0.974    1.272    1.715-   2.093     3.27     4.18
  4     0.644    0.813    1.047    1.237     1.74     1.99
  5     0.493    0.613    0.772    0.896     1.21     1.35

  6     0.405+   0.499    0.621    0.714     0.94     1.03
  7     0.347    0.426    0.525+   0.600     0.77     0.85-
  8     0.306    0.373    0.459    0.521     0.67     0.73
  9     0.275-   0.334    0.409    0.464     0.59     0.64
 10     0.250    0.304    0.371    0.419     0.53     0.58

 11     0.233    0.280    0.340    0.384     0.48     0.52
 12     0.214    0.260    0.315+   0.355+    0.44     0.48
 13     0.201    0.243    0.294    0.331     0.41     0.45-
 14     0.189    0.228    0.276    0.311     0.39     0.42
 15     0.179    0.216    0.261    0.293     0.36     0.39

 16     0.170    0.205-   0.247    0.278     0.34     0.37
 17     0.162    0.195+   0.236    0.264     0.33     0.35+
 18     0.155+   0.187    0.225+   0.252     0.31     0.34
 19     0.149    0.179    0.216    0.242     0.30     0.32
 20     0.143    0.172    0.207    0.232     0.29     0.31

The table gives values of the ratio

    (difference between means) / (mean of sample ranges, ½(w' + w''))

lying on different levels of significance. The levels are the sum, α, of the two tails of the probability distribution.

N.B. When considering deviations in the positive (or negative) direction only, the values of α at the
headings of the columns should be halved.
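A two-sample analogue, using Table 10, might look as follows. Again this is a sketch only: `range_test_two_samples` and the data are hypothetical, and the critical values are the α = 0.05 column of Table 10 for n = 2 to 10.

```python
from statistics import mean

# Two-tailed 5% points of (difference of means) / (mean of the two ranges),
# copied from the alpha = 0.05 column of Table 10 for n = 2..10.
CRIT_05 = {2: 3.427, 3: 1.272, 4: 0.813, 5: 0.613, 6: 0.499,
           7: 0.426, 8: 0.373, 9: 0.334, 10: 0.304}

def range_test_two_samples(first, second):
    """Return (ratio, significant_at_5_percent) for two samples of equal size."""
    if len(first) != len(second):
        raise ValueError("Table 10 applies to samples of equal size only")
    mean_range = 0.5 * ((max(first) - min(first)) + (max(second) - min(second)))
    ratio = abs(mean(first) - mean(second)) / mean_range
    return ratio, ratio > CRIT_05[len(first)]

# Hypothetical data, for illustration only.
a = [12.1, 11.8, 12.4, 12.0, 11.7]
b = [11.2, 11.6, 11.0, 11.4, 11.3]
print(range_test_two_samples(a, b))
```

For a one-sided alternative the N.B. above applies: the same critical value then corresponds to half the tabulated α.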



E. Lord 


Table 11

  n       d_n      1/d_n      √n       d_n√n

  2     1.1284    0.8862    1.4142     1.5958
  3     1.6926    0.5908    1.7321     2.9316
  4     2.0588    0.4857    2.0000     4.1175
  5     2.3259    0.4299    2.2361     5.2009

  6     2.5344    0.3946    2.4495     6.2080
  7     2.7044    0.3698    2.6458     7.1551
  8     2.8472    0.3512    2.8284     8.0531
  9     2.9700    0.3367    3.0000     8.9101
 10     3.0775    0.3249    3.1623     9.7319

 11     3.1729    0.3152    3.3166    10.5232
 12     3.2585    0.3069    3.4641    11.2876
 13     3.3360    0.2998    3.6056    12.0281
 14     3.4068    0.2935    3.7417    12.7469
 15     3.4718    0.2880    3.8730    13.4463

 16     3.5320    0.2831    4.0000    14.1279
 17     3.5879    0.2787    4.1231    14.7932
 18     3.6401    0.2747    4.2426    15.4435
 19     3.6890    0.2711    4.3589    16.0798
 20     3.7350    0.2677    4.4721    16.7032
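The entries of Table 11 can be recomputed numerically. The sketch below assumes that d_n denotes the mean range in samples of n from a normal population with unit standard deviation (the tabulated values agree with that constant); the function names and the integration limits are choices made for this illustration.

```python
import math

def normal_cdf(x):
    """Cumulative distribution function of the unit normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_range(n, lo=-10.0, hi=10.0, steps=20000):
    """Mean range of n unit-normal values: integral of 1 - F^n - (1 - F)^n."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        F = normal_cdf(lo + i * h)
        g = 1.0 - F ** n - (1.0 - F) ** n
        total += g if 0 < i < steps else 0.5 * g   # trapezoidal end weights
    return total * h

for n in (2, 5, 10):
    d = mean_range(n)
    print(n, round(d, 4), round(1 / d, 4), round(math.sqrt(n), 4),
          round(d * math.sqrt(n), 4))
```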


REFERENCES

Daly, J. F. (1946). Ann. Math. Statist. 17, 71.
Davies, O. L. & Pearson, E. S. (1934). J. Roy. Statist. Soc. Suppl. 1, 76.
Fisher, R. A. (1926). Metron, 5, 90.
Hartley, H. O. (1942). Biometrika, 32, 334.
McKay, A. T. & Pearson, E. S. (1933). Biometrika, 25, 416.
Pearson, E. S. (1926). Biometrika, 18, 173.
Pearson, E. S. (1932). Biometrika, 24, 404.
Pearson, E. S. (1936). The Application of Statistical Methods to Industrial
Standardization and Quality Control. B.S. No. 600. London: British
Standards Institution.
Pearson, E. S. & Haines, Joan (1936). J. Roy. Statist. Soc. Suppl. 2, 83.
Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 301.
'Student' (1908). Biometrika, 6, 1.
Tippett, L. H. C. (1926). Biometrika, 17, 364.






THE FREQUENCY DISTRIBUTION OF $\sqrt{b_1}$ FOR SAMPLES OF ALL
SIZES DRAWN AT RANDOM FROM A NORMAL POPULATION

By R. C. GEARY 

1. Introductory 

A research on which the writer has been engaged for some years has so far yielded the 
following results: 

(1) Testing for normality has a greater practical importance than statisticians (including 
the writer) have been disposed to accord to it; actual probabilities may be seriously at 
variance with probabilities derived from the well-known tables computed on the hypothesis 
of universal normality; in consequence, testing for normality and, where necessary, 
correction (even if rough and tentative) for suspected universal non-normality, should 
become a part of statistical routine. 

(2) For large samples, $\sqrt{b_1}$ and $b_2$ are the most efficient of large fields of tests of skewness
and kurtosis, respectively, amongst large fields of alternative universes.

These matters will be dealt with in detail in subsequent papers. It seems, in the first
instance, desirable to derive the frequency distribution of $\sqrt{b_1}$ for normal random samples
of all sizes, partly on account of the inherent importance of the problem, partly in order to
explore a computational technique which might be found effective in solving the analogous
but probably more difficult $b_2$ problem.

Towards the solution of the problem there are available the exact values of the first four even
moments (the odd moments are, of course, zero) of normal $\sqrt{b_1}$, the second, fourth and
sixth having been determined by R. A. Fisher (1930) and the eighth by Joseph Pepper
(1932). It may be useful here to set out the four moments. Taking

\[
\sqrt{b_1} = m_3/m_2^{3/2} = n^{1/2}\sum_i (x_i-\bar{x})^3 \Big/ \Big\{\sum_i (x_i-\bar{x})^2\Big\}^{3/2}, \tag{1.1}
\]

where $n$ is the sample number, we have

\[
\begin{aligned}
\mu_2 &= \frac{6(n-2)}{(n+1)(n+3)},\\
\mu_4 &= \frac{108(n-2)(n^2+27n-70)}{(n+1)(n+3)(n+5)(n+7)(n+9)},\\
\mu_6 &= \frac{3240(n-2)(n^4+84n^3+2695n^2-15168n+20020)}{(n+1)(n+3)(n+5)(n+7)(n+9)(n+11)(n+13)(n+15)},\\
\mu_8 &= \frac{7\cdot 5\cdot 3^5\cdot 2^4\,(n-2)(n^6+171n^5+13893n^4+580401n^3-5131014n^2+14132268n-12932920)}{(n+1)(n+3)(n+5)\ldots(n+17)(n+19)(n+21)}.
\end{aligned}
\tag{1.2}
\]
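As a check on the formulae (1.2), the four moments can be evaluated in exact rational arithmetic. The sketch below is an illustration only; the function names are hypothetical. For n = 4 and n = 5 it reproduces the 'actual' moments quoted later in the paper (for example 0.342857 and 0.375 for the second moment).

```python
from fractions import Fraction

def odd_product(n, last):
    """(n + 1)(n + 3)...(n + last), with last an odd integer."""
    p = Fraction(1)
    for j in range(1, last + 1, 2):
        p *= n + j
    return p

def sqrt_b1_moments(n):
    """Exact 2nd, 4th, 6th and 8th moments of sqrt(b1) for normal samples of n."""
    n = Fraction(n)
    mu2 = 6 * (n - 2) / odd_product(n, 3)
    mu4 = 108 * (n - 2) * (n**2 + 27*n - 70) / odd_product(n, 9)
    mu6 = (3240 * (n - 2)
           * (n**4 + 84*n**3 + 2695*n**2 - 15168*n + 20020)
           / odd_product(n, 15))
    mu8 = (7 * 5 * 3**5 * 2**4 * (n - 2)
           * (n**6 + 171*n**5 + 13893*n**4 + 580401*n**3
              - 5131014*n**2 + 14132268*n - 12932920)
           / odd_product(n, 21))
    return mu2, mu4, mu6, mu8

for size in (4, 5, 6):
    print(size, [round(float(m), 6) for m in sqrt_b1_moments(size)])
```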





E. S. Pearson (1931, 1936) derived empirically 0.05 and 0.01 probability points for certain
values of $n \ge 25$ using a Pearson Type VII curve and earlier approximations by R. A. Fisher
(1929) of the second and fourth moments.

The method here used for the derivation of the frequency distribution of $\sqrt{b_1}$ is essentially an
elaboration of that which the author used (1935, 1936) for finding the frequency distribution of
the test of kurtosis $a$ (the ratio of the mean deviation to the standard deviation of the numbers
sampled), which consisted in establishing a relation in integral form between the frequency
ordinate for $n$ and that for $(n-1)$ and thereby determining the ordinates to any
required degree of accuracy for the lower $n$'s. At a certain stage the actual frequency is shown
to be very close to the value based on the Gram-Charlier curve for the same value of $n$; and
the assumption is made that the Gram-Charlier may be relied on for values of $n$ greater
than the 'transition value'. In the present problem the known normal moments are utilized
as well at every stage. In the concluding section the status of the solution in the hierarchy
of 'precision' is discussed.

Since the frequency is symmetrical, attention is confined practically exclusively to the 
positive sector. 


2. The general integral iteration 

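To distinguish the sample size by the notation let the value of $\sqrt{b_1}$ be indicated by $t_n$. Apply
a Helmert orthogonal transformation to the original observations $x_1, x_2, \ldots, x_n$ so that

\[
\begin{aligned}
x'_1 &= (x_1 - x_2)/\sqrt{2},\\
x'_2 &= (x_1 + x_2 - 2x_3)/\sqrt{6},\\
&\;\;\vdots\\
x'_{n-1} &= \{x_1 + x_2 + \ldots + x_{n-1} - (n-1)x_n\}/\sqrt{n(n-1)},\\
x'_n &= (x_1 + x_2 + \ldots + x_n)/\sqrt{n} = \bar{x}\sqrt{n},
\end{aligned}
\tag{2.1}
\]

which, on inversion, gives

\[
\begin{aligned}
x_1 - \bar{x} &= \frac{x'_1}{\sqrt{2}} + \frac{x'_2}{\sqrt{6}} + \ldots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_2 - \bar{x} &= -\frac{x'_1}{\sqrt{2}} + \frac{x'_2}{\sqrt{6}} + \ldots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_3 - \bar{x} &= -\frac{2x'_2}{\sqrt{6}} + \frac{x'_3}{\sqrt{12}} + \ldots + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
&\;\;\vdots\\
x_{n-1} - \bar{x} &= -\frac{(n-2)x'_{n-2}}{\sqrt{(n-1)(n-2)}} + \frac{x'_{n-1}}{\sqrt{n(n-1)}},\\
x_n - \bar{x} &= -\frac{(n-1)x'_{n-1}}{\sqrt{n(n-1)}}.
\end{aligned}
\tag{2.2}
\]

Then

\[
\sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n-1} x_i'^2 \tag{2.3}
\]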



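and

\[
\begin{aligned}
\sum_{i=1}^{n}(x_i-\bar{x})^3 ={}& 3x_1'^2\Big(\frac{x'_2}{\sqrt{6}} + \frac{x'_3}{\sqrt{12}} + \ldots + \frac{x'_{n-1}}{\sqrt{(n-1)n}}\Big)
+ 3x_2'^2\Big(\frac{x'_3}{\sqrt{12}} + \frac{x'_4}{\sqrt{20}} + \ldots + \frac{x'_{n-1}}{\sqrt{(n-1)n}}\Big)\\
&+ \ldots + \frac{3x_{n-2}'^2\,x'_{n-1}}{\sqrt{(n-1)n}}
- \frac{x_2'^3}{\sqrt{6}} - \frac{2x_3'^3}{\sqrt{12}} - \frac{3x_4'^3}{\sqrt{20}} - \ldots - \frac{(n-2)x_{n-1}'^3}{\sqrt{(n-1)n}}.
\end{aligned}
\tag{2.4}
\]

Apply a polar transformation to the $x'_i$, that is,

\[
\begin{aligned}
x'_1 &= r\sin\phi_{n-3}\sin\phi_{n-4}\ldots\sin\phi_1\sin\phi_0,\\
x'_2 &= r\sin\phi_{n-3}\sin\phi_{n-4}\ldots\sin\phi_1\cos\phi_0,\\
x'_3 &= r\sin\phi_{n-3}\sin\phi_{n-4}\ldots\sin\phi_2\cos\phi_1,\\
x'_4 &= r\sin\phi_{n-3}\sin\phi_{n-4}\ldots\sin\phi_3\cos\phi_2,\\
&\;\;\vdots\\
x'_{n-3} &= r\sin\phi_{n-3}\sin\phi_{n-4}\cos\phi_{n-5},\\
x'_{n-2} &= r\sin\phi_{n-3}\cos\phi_{n-4},\\
x'_{n-1} &= r\cos\phi_{n-3},
\end{aligned}
\tag{2.5}
\]

so that

\[
\sum x_i'^2 = \sum (x_i-\bar{x})^2 = r^2, \tag{2.6}
\]

and

\[
\begin{aligned}
t_n = n^{1/2}\Big\{&\frac{3}{\sqrt{6}}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\ldots\sin^3\phi_1\sin^2\phi_0\cos\phi_0
+ \frac{3}{\sqrt{12}}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\ldots\sin^3\phi_2\sin^2\phi_1\cos\phi_1\\
&+ \ldots + \frac{3}{\sqrt{(n-2)(n-1)}}\sin^3\phi_{n-3}\sin^2\phi_{n-4}\cos\phi_{n-4}
+ \frac{3}{\sqrt{(n-1)n}}\sin^2\phi_{n-3}\cos\phi_{n-3}\\
&- \frac{1}{\sqrt{6}}\sin^3\phi_{n-3}\sin^3\phi_{n-4}\ldots\sin^3\phi_1\cos^3\phi_0
- \frac{2}{\sqrt{12}}\sin^3\phi_{n-3}\ldots\sin^3\phi_2\cos^3\phi_1\\
&- \ldots - \frac{n-3}{\sqrt{(n-2)(n-1)}}\sin^3\phi_{n-3}\cos^3\phi_{n-4}
- \frac{n-2}{\sqrt{(n-1)n}}\cos^3\phi_{n-3}\Big\},
\end{aligned}
\tag{2.7}
\]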


whence the fundamental iteration

\[
t_n = n^{1/2}\Big\{\frac{t_{n-1}}{(n-1)^{1/2}}\sin^3\phi_{n-3}
+ \frac{3}{[(n-1)n]^{1/2}}\sin^2\phi_{n-3}\cos\phi_{n-3}
- \frac{n-2}{[(n-1)n]^{1/2}}\cos^3\phi_{n-3}\Big\}, \tag{2.8}
\]

in which there intervenes only the angle $\phi_{n-3}$; and for normal random samples it is a well-known
fact that the $\phi_i$ are distributed independently of one another, the distribution of
$\phi_{n-3}$ being of the form

\[
C\sin^{n-3}\phi_{n-3}\,d\phi_{n-3}. \tag{2.9}
\]
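The iteration (2.8) can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and it assumes (2.8) in the form $t_n = n^{1/2}\{t_{n-1}\sin^3\phi/(n-1)^{1/2} + 3\sin^2\phi\cos\phi/[(n-1)n]^{1/2} - (n-2)\cos^3\phi/[(n-1)n]^{1/2}\}$, with $\cos\phi_{n-3} = x'_{n-1}/r$ taken from the Helmert and polar transformations. It computes $\sqrt{b_1}$ for a sample directly and again from the value for the first n - 1 observations.

```python
import math
import random

def t_stat(xs):
    """sqrt(b1) = sqrt(n) * sum (x - mean)^3 / {sum (x - mean)^2}^(3/2)."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs)
    s3 = sum((x - m) ** 3 for x in xs)
    return math.sqrt(n) * s3 / s2 ** 1.5

def t_by_iteration(xs):
    """t_n rebuilt from t_{n-1} (first n - 1 values) and the angle phi_{n-3}."""
    n = len(xs)
    t_prev = t_stat(xs[:-1])
    m = sum(xs) / n
    r = math.sqrt(sum((x - m) ** 2 for x in xs))
    # Last Helmert variate x'_{n-1}; its ratio to r is cos(phi_{n-3}).
    x_last = (sum(xs[:-1]) - (n - 1) * xs[-1]) / math.sqrt(n * (n - 1))
    c = x_last / r
    s = math.sqrt(1.0 - c * c)          # sin(phi) >= 0 since 0 < phi < pi
    return math.sqrt(n) * (t_prev * s ** 3 / math.sqrt(n - 1)
                           + 3 * s * s * c / math.sqrt(n * (n - 1))
                           - (n - 2) * c ** 3 / math.sqrt(n * (n - 1)))

random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(7)]
print(t_stat(sample), t_by_iteration(sample))
```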

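Now $t_{n-1}$ involves only $\phi_{n-4}$ and the angles of lower order; hence it is independent of $\phi_{n-3}$. Accordingly, if the
frequency distribution of $t_{n-1}$ is of the form

\[
f_{n-1}(t_{n-1})\,dt_{n-1}, \tag{2.10}
\]

the joint distribution of $\phi_{n-3}$ and $t_{n-1}$ is given by

\[
C\sin^{n-3}\phi_{n-3}\,d\phi_{n-3} \times f_{n-1}(t_{n-1})\,dt_{n-1}. \tag{2.11}
\]

Now, from (2.8),

\[
\frac{\partial t_n}{\partial t_{n-1}} = \Big(\frac{n}{n-1}\Big)^{1/2}\sin^3\phi_{n-3}. \tag{2.12}
\]

On substituting in (2.11) and integrating we find for the frequency of $t_n$ the expression

\[
f_n(t_n) = C\Big(\frac{n-1}{n}\Big)^{1/2}\int d\phi_{n-3}\,\sin^{n-6}\phi_{n-3}\,f_{n-1}(t_{n-1}), \tag{2.13}
\]

where the relation (2.8) obtains and $C$ is the constant of (2.9), namely $\{\tfrac12(n-3)\}!\big/[\sqrt{\pi}\,\{\tfrac12(n-4)\}!]$. Integration extends to values of $\phi_{n-3}$ (so that $0 < \phi_{n-3} < \pi$
for $n > 3$) which yield non-zero values of $f_{n-1}$. Setting $\cos\phi_{n-3} = x$ the integral at (2.13)
assumes the form

\[
f_n(t_n) = C\Big(\frac{n-1}{n}\Big)^{1/2}\int dx\,(1-x^2)^{\frac12(n-7)}\,f_{n-1}(t_{n-1}), \tag{2.14}
\]

with, from (2.8),

\[
n^{1/2}\,t_{n-1} = [(n-1)^{1/2}t_n - 3x + (n+1)x^3]/(1-x^2)^{3/2}. \tag{2.15}
\]

In the derivation of the frequencies for n = 4 to 8 inclusive, dealt with in later sections, both
the forms (2.13) and (2.14) are used.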

3. Functional discontinuities of the frequency 

In the integral at (2.14) $t_n$ appears merely as a parameter. Consequently the nature of the
frequency $f_n(t_n)$ depends to a considerable extent on the simple algebraic properties of
$t_{n-1}(x)$ given by (2.15). The following property (easily demonstrated) is fundamental:

For
\[
t_n = (n-2k)/[k(n-k)]^{1/2} = {}_k\tau_n \quad (k = 1, 2, \ldots), \tag{3.1}
\]
$t_{n-1}(x)$ has a maximum value of
\[
(n-2k+1)/[(k-1)(n-k)]^{1/2} = {}_{k-1}\tau_{n-1} \tag{3.2}
\]
for
\[
x = -[(n-k)/k(n-1)]^{1/2} = {}_k\xi'_n, \tag{3.3}
\]
and a minimum value of
\[
(n-2k-1)/[k(n-k-1)]^{1/2} = {}_k\tau_{n-1} \tag{3.4}
\]
for
\[
x = [k/(n-1)(n-k)]^{1/2} = {}_k\xi''_n. \tag{3.5}
\]
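The property stated at (3.1) to (3.5) is easily verified numerically. The sketch below is an illustration only; the function names are hypothetical, and the relation (2.15) is assumed in the form $n^{1/2}t_{n-1} = [(n-1)^{1/2}t_n - 3x + (n+1)x^3]/(1-x^2)^{3/2}$. For $t_n$ at its kth link it evaluates $t_{n-1}(x)$ at the two stationary abscissae and compares the results with the (k-1)th and kth links of $t_{n-1}$. It is restricted to k at least 2, since for k = 1 the maximum recedes to x = -1.

```python
import math

def link(n, k):
    """k-th link of t_n, as at (3.1)."""
    return (n - 2 * k) / math.sqrt(k * (n - k))

def t_prev(x, t_n, n):
    """t_{n-1} as a function of x = cos(phi_{n-3}), as at (2.15)."""
    return ((math.sqrt(n - 1) * t_n - 3 * x + (n + 1) * x ** 3)
            / (math.sqrt(n) * (1 - x * x) ** 1.5))

def extrema_at_link(n, k):
    """Maximum and minimum of t_{n-1}(x) when t_n sits on its k-th link (k >= 2)."""
    t_n = link(n, k)
    x_max = -math.sqrt((n - k) / (k * (n - 1)))      # abscissa at (3.3)
    x_min = math.sqrt(k / ((n - 1) * (n - k)))       # abscissa at (3.5)
    return t_prev(x_max, t_n, n), t_prev(x_min, t_n, n)

for n, k in ((7, 2), (8, 3)):
    print(n, k, extrema_at_link(n, k), (link(n - 1, k - 1), link(n - 1, k)))
```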


Definition. ${}_k\tau_n$ are termed the link values or links of $t_n$. The regions between consecutive
links are termed zones. The graph of $t_{n-1}(x)$ for $-1 \le x \le +1$ and $t_n = {}_k\tau_n$ (given at (3.1))
is illustrated in Fig. 1. The limits of integration for integral (2.14) are now seen to be ${}_k\lambda'_n$ and
${}_k\lambda''_n$, which are the values of $x$ at which the ordinates of the curve (2.15) in $(x, t_{n-1})$, with
parameter $t_n = {}_k\tau_n$, assume the limiting values $+{}_1\tau_{n-1}$ and $-{}_1\tau_{n-1}$. The scale on the right
shows the links of $t_{n-1}$. The curve $t_n = {}_k\tau_n$ traverses all the zones but has a 'turn' in the
$(k-1)$th zone, remaining entirely in the zone the while. It is due to this turn that the
phenomenon of functional discontinuity manifests itself in the frequency $f_n(t_n)$.

Assume that within the $k$th zone the frequency $f_{n-1}(t_{n-1})$ is represented by ${}_kf_{n-1}(t_{n-1})$,
different in functional form for different values of $k$ but the same (for example, having the
same coefficients in a power series) within each zone. It will at once be evident, from (2.14),
that the frequency of $t_n$ will have a like property. Now, from (2.4) and (2.5) it will be seen that

\[
t_3 = -\frac{\cos 3\phi_0}{\sqrt{2}}; \tag{3.6}
\]

the distribution of $\phi_0$ is rectangular, so that the distribution* of $t_3$ is given by

\[
f_3(t_3) = \frac{1}{\pi(\tfrac12 - t_3^2)^{1/2}}, \quad |t_3| \le 1/\sqrt{2}, \tag{3.7}
\]

and zero for $|t_3| > 1/\sqrt{2}$. It follows that $t_3$ has a functional discontinuity at its links $\pm 1/\sqrt{2}$.
Hence, by iteration, the frequency of $t_n$ is represented by different functional expressions in its
different interlink zones.


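[Fig. 1. Graph of $t_{n-1}(x)$ for $t_n$ at a link value; the scale on the right shows the links of $t_{n-1}$.]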






That the frequency has a finite limit ${}_1\tau_n$ (when $n$ is finite) is established as follows. It can
easily be seen from (2.15) that when $t_n = {}_1\tau_n$ the curve $t_{n-1} = t_{n-1}(x)$ degenerates into (i) the
straight line $x = -1$ and (ii) a section above the straight line $t_{n-1} = {}_1\tau_{n-1}$ but touching it.
For $t_n > {}_1\tau_n$ no part of the curve $t_{n-1} = t_{n-1}(x)$ falls within the rectangle $x = \pm 1$, $t_{n-1} = \pm{}_1\tau_{n-1}$.

Reference to (2.14) shows at once that, if $f_{n-1}(t_{n-1}) = 0$ for $|t_{n-1}| > {}_1\tau_{n-1}$, then $f_n(t_n) = 0$ for
$|t_n| > {}_1\tau_n$. But (3.7) shows that $t_3$ has as limiting values $\pm 1/\sqrt{2}$. Hence, by iteration, it
follows that the limiting values of the frequency of $t_n$ (or simpliciter of $t_n$) are

\[
\pm{}_1\tau_n = \pm(n-2)/(n-1)^{1/2}. \tag{3.8}
\]


* R. A. Fisher (1930). 




As will presently appear, the frequencies for n = 4 and 5 have marked irregularities:
successive integration in accordance with (2.14) imparts, of course, a progressively increasing
degree of smoothness to the frequency. To give mathematical expression to this feature,
recourse is had to the idea of order of contact.

Definition. Two functions are said to have contact of order ${}_k\gamma_n$ at link ${}_k\tau_n$ if the functions
and their first $({}_k\gamma_n - 1)$ derivatives are finite and equal at the link. It can be shown without
difficulty that

\[
{}_k\gamma_n = {}_{k-1}\gamma_{n-1} + 1, \tag{3.9}
\]

when $k > 1$, $n > 4$. For what follows it will be convenient to set out for the smaller sample
numbers the values of the links and their orders of contact. The links for positive values
only of the variables are shown. The orders of contact ${}_1\gamma_n$ will appear from a proposition
proved in § 5, giving the actual values of the frequencies near the limit of range. The
non-diminishing smoothness in the direction of the centre of the range will be noted.


Values of kτn and kγn for n = 3 to 8 inclusive

         1st link        2nd link         3rd link         4th link
  n      1τn     1γn     2τn      2γn     3τn       3γn    4τn    4γn

  3     1/√2      0       -        -       -         -      -      -
  4     2/√3      0       0        0       -         -      -      -
  5     3/2       1      1/√6      1       -         -      -      -
  6     4/√5      1      1/√2      2       0         2      -      -
  7     5/√6      2      3/√10     2     1/(2√3)     3      -      -
  8     6/√7      2      2/√3      3      2/√15      3      0      4


For even values of n the origin is always a link. In the determination of the frequencies
for n = 5 to 8, by the methods described in subsequent sections, the link ordinates and the
central ordinate play a cardinal role. In fact, the method will be seen to consist essentially
in finding curves which pass through the central and link ordinal points, have the required
orders of contact and the required form at the limit of range and have the exact earlier
momental values (see first section).

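4. The frequency near the centre of range

It will first be shown that

\[
f'_n(+0) = 0 \quad \text{for } n > 4. \tag{4.1}
\]

In fact, from (2.14) and (2.15), if $t_n = u$, a small positive quantity,

\[
f_n = C\int_{-\lambda-\kappa u}^{+\lambda-\kappa u} dx\,(1-x^2)^{\frac12(n-7)}\,f_{n-1}[t_{n-1}(x)],
\]

$\lambda$, $\kappa$ being positive constants. Hence, writing $\lambda' = \lambda - \kappa u$ and $\lambda'' = -\lambda - \kappa u$ for the limits of integration,

\[
f'_n = -C\kappa\{(1-\lambda'^2)^{\frac12(n-7)} f_{n-1}[t_{n-1}(\lambda')] - (1-\lambda''^2)^{\frac12(n-7)} f_{n-1}[t_{n-1}(\lambda'')]\}
+ C'\int_{\lambda''}^{\lambda'} dx\,(1-x^2)^{\frac12(n-10)}\,f'_{n-1}[t_{n-1}(x)].
\]

Letting $u \to 0$ the integral-free expression obviously vanishes provided that $f_{n-1}[t_{n-1}(\lambda)]$ is
finite, which it is when $n > 4$; and the integral becomes

\[
C'\int_{-\lambda}^{+\lambda} dx\,(1-x^2)^{\frac12(n-10)}\,f'_{n-1}[t_{n-1}(x)].
\]

Since $f_{n-1}(y)$ is an even function of $y$, its derivative is odd, and it remains an odd function
when $y$ is replaced by an odd function of $x$. Hence the integral vanishes.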


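5. The form of the frequency at the limit of range

In this section it will be shown that near $t_n = \pm(n-2)/(n-1)^{1/2}$ the frequency is given by

\[
f_n(t_n) \doteq \frac{1}{3\sqrt{\pi}}\,\frac{\{\tfrac12(n-3)\}!\,(n-1)^{\frac14(n-2)}}{\{\tfrac12(n-4)\}!}
\Big\{\frac{2}{3n}\Big(\frac{n-2}{(n-1)^{1/2}} - |t_n|\Big)\Big\}^{\frac12(n-4)}. \tag{5.1}
\]

It may be seen at once that for n = 3 the frequency by (5.1) would be

\[
f_3(t_3) \doteq \frac{1}{\pi\,2^{1/4}}\Big(\frac{1}{\sqrt{2}} - t_3\Big)^{-1/2}, \tag{5.2}
\]

as at (3.7). For n = 4, (5.1) gives $\tfrac16\sqrt{3}$, which is the value found by A. T. McKay (1933).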
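The constant in (5.1) can be checked against the special cases quoted in the paper. The sketch below is an illustration only: it assumes that the limiting form is $K_n\{(n-2)/(n-1)^{1/2} - t_n\}^{\frac12(n-4)}$ with $K_n = \{\tfrac12(n-3)\}!\,(n-1)^{\frac14(n-2)}(2/3n)^{\frac12(n-4)}\big/[3\sqrt{\pi}\,\{\tfrac12(n-4)\}!]$, which is a reconstruction, and the function names are hypothetical. The values compared are $\tfrac16\sqrt{3}$ for n = 4, the coefficient 0.219166 used for n = 5 and the coefficient 5/36 used for n = 6.

```python
import math

def limit_constant(n):
    """K_n in the assumed limiting form K_n * (tau - t)^((n - 4) / 2)."""
    # Gamma((n-1)/2) / Gamma((n-2)/2) equals {(n-3)/2}! / {(n-4)/2}!
    ratio = math.gamma((n - 1) / 2) / math.gamma((n - 2) / 2)
    return (ratio * (n - 1) ** ((n - 2) / 4) * (2 / (3 * n)) ** ((n - 4) / 2)
            / (3 * math.sqrt(math.pi)))

def frequency_near_limit(t, n):
    """Approximate frequency of sqrt(b1) just inside the upper limit of range."""
    tau = (n - 2) / math.sqrt(n - 1)
    return limit_constant(n) * (tau - t) ** ((n - 4) / 2)

for n in (3, 4, 5, 6):
    print(n, limit_constant(n))
```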
The general theorem will be proved by iteration. We assume a general form

\[
f_{n-1}(t_{n-1}) = C_{n-1}\Big(\frac{(n-3)^2}{n-2} - t_{n-1}^2\Big)^{\frac12(n-5)}, \tag{5.3}
\]

and show that a similar form emerges for $f_n(t_n)$, finding incidentally an iteration relation for
the constant $C_n$. First set

\[
v = n - 2 - (n-1)^{1/2}\,t_n,
\]

and assume that $v$ is a positive quantity. It will readily appear, from (2.15), that, for $v = 0$,
the equation $t_{n-1}(x) = (n-3)/(n-2)^{1/2}$ has a double root at $x = 1/(n-1)$. Accordingly we set

\[
x = x' + 1/(n-1). \tag{5.4}
\]

J v (w— 3) 2 2 

“ d 

(5-5) 

Having regard only to principal terms we find 


! X 2^* n ( n ~ 2 ) 
x ~ (w — 1)* ’ 

(5-6) 

X^ln-\n - 1 ) 3 (n - 2)~ 2 (n - 3) ( 1 - x" 2 ) vK 

(5-7) 

with *■ - (f=f 

(5-8) 

Now, from (2*14), 

f <” - 4li 1 - '• ' 



and, from the analysis in § 3, it will be clear that there are two separate parts of the domain L ) : 

(I) a part near x = 1 /(n— 1) for which < n _ x is entirely in the first zone and by hypothesis 
has the form (5-3); 

(II) a part near x = — 1 in which t n _ x assumes all values. 


* The symbol $\doteq$ signifies 'equals, to required approximation'.




Let $f_n(t_n) = f_n^{\mathrm{I}}(t_n) + f_n^{\mathrm{II}}(t_n)$, (5.9)

where the functions on the right represent the contributions accruing from the respective 
parts of the domain of integration. Then 

x (n-2)- 2 (n-3)(l-*' a )v*}‘<«-« 

which, on a change of variable from x to x " by (5-4) and (5-8) and integrating in x", becomes 

3 "* ^ n ~^~^- n ~ Un ~ a) ^ ~ 1 )”“ 3 ( n ~ 2 )" <n_4) (*» “ 3)* <n “ 6) (^4 - ' 

’ ' ( 6 * 10 ) 

which is of the required form. As regards the contribution of II, in (2*15) set 

x + 1 = fiv 4- yv*. • (6-11) 

It will be found that when t n _ t has its limiting value + (n — 3 )/(n — 2)* the vanishing of the 
terms in v* and tP gives 

, , , 2/2 (n — 3) 

“ ,,d 

whereas if t n _ x has its limiting value — (n — 3)/(w — 2)* the values are 

0 = ln .id 

Between the limits of x, 

v 2 12 n — 3 
+ 3»“ 9y 3 ti 2 (»-2)* 1 

t n _ i( x ) given by (2-15), assumes once all values between its limits of range and, in fact, 


/2 J>- 3) 
3» a (n — 2)*‘ 


(5-12) 


Now 


AHtJ = 




i(n- 3)! 
Jnf( ra-4)! 


/ 2<„-x 

/3»* ' 

* 

(513) 

1 

. dx( l-x 2 )«"- 7 >/«-i(<„-i). 

i 

(5-14) 


the limits of integration in x being given by (5* 1 2). By (6- 11) and (5*13) change the variable 
x into (via y) when (5-14) becomes 


A l (0 - C(n) 

and the integral on the right is unity. Written in full (5-15) then becomes 


/»(<„)- Q 1 . 


-3)! (»-!)“« ®> In- 2* 


3 v /ffn|(n-4)!(3 nn _T2)*< i 


L i n -rl -A 


(5-15) 


(5-16) 


There is no difficulty now in proving by iteration from (5.10) and (5.16) that the constant
has the form indicated in (5.1). Note that, in (5.9), $f^{\mathrm{I}}$ accounts for $(n-1)/n$ and $f^{\mathrm{II}}$ for $1/n$
of the total frequency.





6. Samples of 4

From (2.14), (2.15) and (3.7) the frequency for n = 4 is found to be

\[
f_4(t_4) = \frac{\sqrt{3}}{2\pi}\int_D dx\,y^{-1/2}, \tag{6.1}
\]

where

\[
y = 2(1-x^2)^3 - (v - 3x + 5x^3)^2 \quad \text{with } v = \sqrt{3}\,t_4. \tag{6.2}
\]

$D$ is the range of values of $x$ which give non-negative values for $y$ with $|x| \le 1$. Now

\[
y = -(3x^2-1)^2(3x^2-2) - 2v(5x^3-3x) - v^2, \tag{6.3}
\]

from which it appears that when $v$ is small $y$ has two real roots near $-1/\sqrt{3}$, two imaginary
roots near* $+1/\sqrt{3}$, and single roots near $+\sqrt{2}/\sqrt{3}$ and $-\sqrt{2}/\sqrt{3}$, accounting thus for all six
roots. To the leading order in $v$ the four real roots are

\[
-\frac{1}{\sqrt{3}} \pm a v^{1/2}, \quad \text{with } a^2 = \frac{2}{9\sqrt{3}}, \qquad \text{and} \qquad \pm\sqrt{\tfrac{2}{3}} - \frac{v}{9}.
\]

Hence the integral at (6.1) may be written as the sum of five integrals

\[
\int_D = \int_{-\sqrt{2/3}-v/9}^{-\sqrt{2/3}+v/9}
+ \int_{-\sqrt{2/3}+v/9}^{-1/\sqrt{3}-av^{1/2}}
+ \int_{-1/\sqrt{3}+av^{1/2}}^{1/\sqrt{3}-av^{1/2}}
+ \int_{1/\sqrt{3}-av^{1/2}}^{1/\sqrt{3}+av^{1/2}}
+ \int_{1/\sqrt{3}+av^{1/2}}^{\sqrt{2/3}-v/9}.
\]

Fig. 2 illustrates the division of the region of integration $D$. There are five divisions,
numbered I-V, in what follows, in which we regard as 'principal terms' only those in $\log v$
and the constant term. Terms in $v^{1/2}$ will ultimately be ignored:


i = 

r-v 


J -VI-U/9 

n = 



J -VUvIQ 

ill = 

r + \IV3-*vt 

J -1 ly/S-av* 

IV = 

j* l/V3 + at)i 

* 

* 1/V3~at)l 


V> *= Jdx( - 1 + 3x 2 ) -1 (2 - 3x 2 )~* 

“53S { - 4 (i® lo e 


1/Va-f avi 


=£= 12"* J (x' 2 + l)~*dx f = ~gSinh~ 1 1. 


Neglecting 0(v *) we accordingly have 
I + II + III + IV+V = 


^3 *° g ^ a2v + J3 sinil1 1 + 


The constant 2 C derives from additional terms in integrals II, III and V : 

z 

f+r * > 

TTT = I dtr .( M — — ' , ln( fir 3 — 


III = J* dx{ ( 1 — 3x 2 ) 2 (2 — 3x 2 ) — 2v(5x 3 — 3x) — r 2 }“ 1 
y = 1/^3-av*. 

* In a sense which will be obvious from Fig. 2. 




We have already taken account of the term in III found when v is zero. The constant C 
derives from the even powers of 2 in the formal expansion of the denominator of the integral 
element — the odd powers vanish by symmetry. Setting 

* = 1-3** ^2^/3 *', 2-3**^l, 5**-3*^ -|^/3. 

On expansion of III, 

® /•l/v'3 - 

c** 22 d*'*'-4*-i (2v)« (2 V3)-**- 1 C ik , 

k**XJ acl 

where C u are the even-order coefficients in the expansion of (1 + 2 )“*, i.e. C tk is the coefficient 
of 2 **. On integration we are interested only in the value at the lower limit av*, for all 
terms at the upper limit (and certain terms at the lower limit) are 0( v*) at least. Hence 



Note. This diagram is designed merely to give a general idea of the limits of integration. It is not
drawn to any scale. Following are the values of the ξ:


£1 = 

£ = V§+<’/9 
ir = vs - «/» 


£, = 1/V3 

& = l/vft + ai’ 1 
£= I/V3-aid 


| a* = 2/(9 V3) 


Similarly II + V also yield C giving a constant additional term of 20. 

Now 0 - Ja !,% - <tli,+c *** + - 1 


1 f 1 dx 1 C 1 dx (/ , ov 

“ ys.l a, + 2^J „ i Ui + 0 -**>-» 

“ ,/3 L ~ ° 8 * ° 8 (i +?)>'+ 1 * 08 l + <r-Vr»r*J 


x<=0 


1 


= j3{l° g 2+Jlo g (2l- 1 )}. 


(6-5) 



All logs are to base e (unless otherwise indicated in what follows). Hence

\[
f_4(t_4) \doteq 0.372646 - 0.159155\log v \tag{6.6}
\]
\[
\phantom{f_4(t_4)} = 0.285222 - 0.366466\log_{10}|t_4|, \tag{6.7}
\]

since $v = \sqrt{3}\,t_4$.

A. T. McKay (1933), from a different approach, gave the log term in (6.7), and, as a rough
approximation to the constant term, the value 0.311568. He also showed that an expression
of the form (6.7) accounted for most of the frequency, a fact of great importance. Assume
that the residual term is of form

\[
A|t_4|^{1/2} + B|t_4|,
\]

and find $A$ and $B$ from

(i) $f_4(2/\sqrt{3}) = \tfrac16\sqrt{3}$ (from (5.1); also McKay (1933)),

(ii) total frequency is unity,

giving

\[
f_4(t_4) = 0.285222 - 0.366466\log_{10}|t_4| - 0.009178|t_4|^{1/2} + 0.031359|t_4|. \tag{6.8}
\]

For algebraic manipulation at the next stage the form of residual $A'|t_4| + B't_4^2$ will be found
more convenient, however, with $A'$ and $B'$ also determined from (i) and (ii). In this form

\[
f_4(t_4) = 0.285222 - 0.159155\log|t_4| + 0.014275|t_4| + 0.007398\,t_4^2. \tag{6.9}
\]

Note the smallness of the coefficients $A$, $B$, $A'$ and $B'$ in (6.8) and (6.9).

In the following table the first four even moments as derived from frequencies (6.8) and
(6.9) are compared with the actual values as derived from the formulae (1.2). Both formulae
yield excellent approximations, with (6.8) always superior to (6.9) however. Either formula
can obviously be used with complete confidence for deriving the probability points. The
frequency graph in Fig. 4 is derived from (6.8) which should also be used for the computation
of the probability points.




                              Formula
  Moment     Actual       (6.8)       (6.9)

  μ2        0.342857    0.342930    0.342470
  μ4        0.258941    0.258979    0.258606
  μ6        0.240503    0.240263    0.240205
  μ8        0.245940    0.245949    0.246662
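The moments of the approximate frequency (6.9) can be recomputed in closed form, since each term integrates elementarily over the range 0 to $2/\sqrt{3}$. The sketch below is an illustration only; the function name is hypothetical and the coefficients are those of (6.9) as read above.

```python
import math

T = 2 / math.sqrt(3)                        # limit of range for n = 4
C0, C1, C2, C3 = 0.285222, 0.159155, 0.014275, 0.007398   # coefficients of (6.9)

def moment_of_6_9(order):
    """Even moment of f4 = C0 - C1*log|t| + C2*|t| + C3*t^2 on (-T, T)."""
    p = order + 1
    constant = C0 * T ** p / p
    # integral of t^order * log t from 0 to T is T^p * (log T / p - 1 / p^2)
    log_term = -C1 * T ** p * (math.log(T) / p - 1 / p ** 2)
    linear = C2 * T ** (p + 1) / (p + 1)
    quadratic = C3 * T ** (p + 2) / (p + 2)
    return 2 * (constant + log_term + linear + quadratic)

for order in (0, 2, 4, 6, 8):
    print(order, round(moment_of_6_9(order), 6))
```

Order 0 gives the total frequency, which condition (ii) requires to be unity; the higher orders can be set against the last column of the table.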


7. Samples of 5

After many computational experiments the method used for determining the frequency
$f_5(t_5)$ was as follows:

(1) Using (2.14) with form (6.9) for $f_4(t_4)$, central and link ordinates, i.e. $f_5(0)$ and $f_5(1/\sqrt{6})$,
were computed.

(2) The approximate value of $f_5(t_5)$ near $t_5 = 1/\sqrt{6} + 0$ was found in the form

\[
f_5(1/\sqrt{6}) + M(t_5 - 1/\sqrt{6})^{1/2},
\]

$M$ being known.



(3) The two zonal curves were found (i) passing through $(0, f_5(0))$ and $(1/\sqrt{6}, f_5(1/\sqrt{6}))$
with $f'_5(0) = 0$ and (ii) passing through $(1/\sqrt{6}, f_5(1/\sqrt{6}))$ and with the required form at $1/\sqrt{6} + 0$
(i.e. as at (2) above) and at the limit $(\tfrac32 - 0)$, so that $\mu_0\ (= 1)$, $\mu_2$, $\mu_4$ and $\mu_6$ have the exact values
as given for n = 5 by the formulae (1.2).

Setting then

\[
f_4(t_4) = 0.285222 - 0.159155\log|t_4| + R(t_4), \tag{7.1}
\]

with

\[
t_4 = \frac{2t_5 - 3x + 6x^3}{\sqrt{5}\,(1-x^2)^{3/2}} \tag{7.2}
\]

and

\[
R(t_4) = 0.014275|t_4| + 0.007398\,t_4^2, \tag{7.3}
\]

we have

\[
f_5(t_5) = \frac{4}{\pi\sqrt{5}}\int_{\lambda}^{\mu}\frac{dx}{1-x^2}\,f_4(t_4), \tag{7.4}
\]

the limits of integration being $\lambda$ (negative) and $\mu$ (positive) which are the values of $x$, from
(7.2), corresponding respectively to $t_4 = -\tfrac23\sqrt{3}$ and $t_4 = +\tfrac23\sqrt{3}$. We shall be concerned only
with the case $t_5 < 1/\sqrt{6}$ when $t_4(x)$ has three real roots $\beta$, $\alpha$ and $\gamma$ of which $\beta$ is negative and $\alpha$
and $\gamma$ are positive. For (7.4) the following are required

dx /I +/t 1 - A\ 

p'dalog(l —a 2 


(7*5) 


(1-a 2 ) 


= iOog 2 (!+/»)- log® ( 1 + A) - log 2 ( 1 - //) + log 2 ( 1 - A)} 
+ £{!og ( 1 + A •) log ( 1 - n) - log ( 1 + A) log ( 1 - A)} 

+ log 2 log (|:^) + ^( l 7)- 2 A ) • 


(7*6) 


y dx . 

j-^log 


2 + 3 


= Jlog(|_^log(l-a)(l-/?)(l-y) 


with 


- •> log log (1 + a) (1 +/}) (1 + y) + * j I(k 4 ) + S J(^)j, 


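\[
\begin{aligned}
\kappa_1 &= (\alpha-\lambda)/(1-\alpha), & \kappa_5 &= (\beta-\lambda)/(1-\beta), & \kappa_9 &= (\gamma-\lambda)/(1+\gamma),\\
\kappa_2 &= (\mu-\alpha)/(1+\alpha), & \kappa_6 &= (\mu-\beta)/(1+\beta), & \kappa_{10} &= (\mu-\gamma)/(1-\gamma),\\
\kappa_3 &= (\gamma-\lambda)/(1-\gamma), & \kappa_7 &= (\alpha-\lambda)/(1+\alpha), & \kappa_{11} &= (\beta-\lambda)/(1+\beta),\\
\kappa_4 &= (\mu-\gamma)/(1+\gamma), & \kappa_8 &= (\mu-\alpha)/(1-\alpha), & \kappa_{12} &= (\mu-\beta)/(1-\beta),
\end{aligned}
\]

\[
\begin{aligned}
I(\kappa) &= \int_0^{\kappa}\frac{\log x\,dx}{1+x} = \log\kappa\,\log(1+\kappa) - \psi(\kappa),\\
J(\kappa) &= \int_0^{\kappa}\frac{\log x\,dx}{1-x} = -\log\kappa\,\log(1-\kappa) - \phi(\kappa),
\end{aligned}
\tag{7.7}
\]

and

\[
\phi(\kappa) = \frac{\kappa}{1^2} + \frac{\kappa^2}{2^2} + \frac{\kappa^3}{3^2} + \ldots, \qquad
\psi(\kappa) = \frac{\kappa}{1^2} - \frac{\kappa^2}{2^2} + \frac{\kappa^3}{3^2} - \ldots,
\]

when $\kappa \le 1$.

It is useful to note that $\phi(1) = 1.644934 = 2\psi(1)$. The functions $\phi(\kappa)$ and $\psi(\kappa)$ do not
appear to be tabulated. By fitting curves to their values for equally spaced intervals of 0.05
from 0 to 0.5 the following very close approximations are found, applicable for $\kappa \le \tfrac12$:

\[
\begin{aligned}
\phi(\kappa) &= 1.000667\kappa + 0.233464\kappa^2 + 0.186062\kappa^3,\\
\psi(\kappa) &= 0.999836\kappa - 0.244220\kappa^2 + 0.077024\kappa^3.
\end{aligned}
\]

When $\tfrac12 \le \kappa \le 1$ the following formulae can be used:

\[
\begin{aligned}
\phi(\kappa) &= \phi(1) - \log\kappa\,\log(1-\kappa) - \phi(1-\kappa),\\
\psi(\kappa) &= \tfrac12\phi(1) + \log\kappa\,\log(1+\kappa) - \phi(1-\kappa) + \tfrac12\phi(1-\kappa^2).
\end{aligned}
\]

When $\kappa > 1$ we use

\[
\begin{aligned}
\phi(\kappa) &= 2\phi(1) - \tfrac12\log^2(1/\kappa) - \phi(1/\kappa),\\
\psi(\kappa) &= 2\psi(1) + \tfrac12\log^2(1/\kappa) - \psi(1/\kappa).
\end{aligned}
\]

Another useful formula is

\[
\phi(\kappa) = \psi(\kappa) + \tfrac12\phi(\kappa^2).
\]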
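The functions $\phi$ and $\psi$ are readily evaluated by machine. The sketch below is an illustration only: it assumes the series $\phi(\kappa) = \sum \kappa^m/m^2$ and $\psi(\kappa) = \sum (-1)^{m-1}\kappa^m/m^2$, uses the series for $\kappa \le \tfrac12$ and the reflection and inversion formulae elsewhere, and the function names are hypothetical. It also prints the cubic approximation for $\phi$ quoted in the text alongside the series value.

```python
import math

PHI_ONE = math.pi ** 2 / 6          # phi(1) = 1.644934...

def _series(k, alternating):
    """Sum k^m / m^2 (signs alternating if requested); intended for 0 <= k <= 1/2."""
    total, power, m = 0.0, 1.0, 0
    while True:
        m += 1
        power *= k
        term = power / (m * m)
        total += -term if (alternating and m % 2 == 0) else term
        if term < 1e-17:
            return total

def phi(k):
    if k <= 0.5:
        return _series(k, False)
    if k < 1.0:
        return PHI_ONE - math.log(k) * math.log(1 - k) - phi(1 - k)
    if k == 1.0:
        return PHI_ONE
    return 2 * PHI_ONE - 0.5 * math.log(k) ** 2 - phi(1 / k)

def psi(k):
    if k <= 0.5:
        return _series(k, True)
    if k <= 1.0:
        return (0.5 * PHI_ONE + math.log(k) * math.log(1 + k)
                - phi(1 - k) + 0.5 * phi(1 - k * k))
    return PHI_ONE + 0.5 * math.log(k) ** 2 - psi(1 / k)   # 2 psi(1) = phi(1)

k = 0.25
print(phi(k), 1.000667 * k + 0.233464 * k ** 2 + 0.186062 * k ** 3)
```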

The algebra of the contribution to (7.4) from $R(t_4)$ is without mathematical interest.
From the formula the following were the values found for the central frequency and the
second link frequency:

\[
f_5(0) = 0.606563; \quad f_5(1/\sqrt{6}) = 0.599069. \tag{7.8}
\]

The moments, computed by (1.2), with n = 5, are

\[
\mu_0 = 1, \quad \mu_2 = 0.375, \quad \mu_4 = 0.361607, \quad \mu_6 = 0.474609, \quad \mu_8 = 0.719382. \tag{7.9}
\]

Computation by approximate integration of certain of the ordinates gave evidence of
marked irregularity near the link $t_5 = 1/\sqrt{6}$. In consequence, it seemed desirable to try to
find a term (in addition to the constant given at (7.8)) of the expansion of $f_5(t_5)$ near $1/\sqrt{6} + 0$.

Setting 


* S- V 6+ *’ 


(7*10) 


where t is small and positive — we shall be interested only in a term in — we find 

a ( pr-At^ dr 

, A7 + L Jr~** A(<4) - 

The values v ± Aft are the abscissae of the points at which the curve l t — t 4 (x), given by (7*2), 
intersects the i 4 link line t x — 2/ N /3 near * = v = — \ ^/6 . It can easily be shown that 


(7*11) 


A* = 




12 ' 


(7*12) 


We are not concerned with the values (A + A't) and (/i+/i't) which are the abscissae corre- 
sponding to the intersection of t 4 = t 4 (x) with t 4 = - 2/^/3 and its third intersection with 
t 4 — 2/^/3. Remembering that at the latter link f 4 (t 4 ) has the value 1/(2 v /3), the integral- 
free term in M in the first derivative /s(< B ) of / 5 (< 5 ) is 

^/5]1^{/4(«-*4<‘)x -^At~i+Uv + Aft)x - \At -»} = (7 ’ 13) 

Also we have to consider the integral term in/ B (< 8 ). For this purpose, from (7*10) and (7*2), 




Remembering (7*1), it can be shown that the only term in (7*11) from which a term in £* 
can come is approximately 


A and //, the limiting values, being respectively negative and positive. 
Differentiating (7*14) in respect of i we find 

v« »/ »i i » n -j> vi' 

Changing variables by x — ^ 

and letting t tend towards + 0, we find for the term in t * 


(7-14) 


(7-16) 


Adding (7*13) and (7-lfl) we find 


?t 2 V 5 \ 5 /' 

-?5^2*3-*r». 


(7-16) 


(7-18) 


On integrating we find for the term in $(t_5 - 1/\sqrt{6})^{1/2}$

\[
-0.594117\,(t_5 - 1/\sqrt{6})^{1/2}. \tag{7.17}
\]

From (5.1) the value of $f_5(x)$ near $x = \tfrac32$ is

\[
0.219166\,(\tfrac32 - x)^{1/2}, \tag{7.18}
\]

where $x$ is usually written for simplicity instead of $t_5$ in the remainder of this section.

Having regard to (7.8) and to the fact that, from § 4, $f'_5(0) = 0$, in the half-zone $(0, 1/\sqrt{6})$,
$f_5(x) = F(x)$ must be of form

\[
F(x) = 0.606563 + a_2x^2 + a_3x^3 + a_4x^4. \tag{7.19}
\]

The first relation between the coefficients is found by giving expression to the fact that
$y = F(x)$ passes through the link-point $(1/\sqrt{6}, 0.599069)$:

\[
0.166667a_2 + 0.068041a_3 + 0.027778a_4 = -0.007494. \tag{7.20}
\]

In the zone $(1/\sqrt{6}, \tfrac32)$ assume that

\[
f_5(x) = G(x) = -0.594117\,(x - 1/\sqrt{6})^{1/2} + 0.219166\,(\tfrac32 - x)^{1/2} + b_0 + b_1(\tfrac32 - x)
+ b_2(\tfrac32 - x)^2 + b_3(\tfrac32 - x)^3, \tag{7.21}
\]

designed to conform with requirements (7.17) and (7.18). Since $y = G(x)$ must pass through
$(\tfrac32, 0)$,

\[
b_0 = 0.594117\,(\tfrac32 - 1/\sqrt{6})^{1/2} = 0.620775. \tag{7.22}
\]

Taking the value of $b_0$ into account and giving algebraic expression to $y = G(x)$ passing
through $(1/\sqrt{6}, 0.599069)$, we find

\[
1.091752b_1 + 1.191922b_2 + 1.301283b_3 = -0.250706. \tag{7.23}
\]


To find the six coefficients $a_2, a_3, a_4$ (in (7.19)) and $b_1, b_2, b_3$ (in (7.21)), we have, so far, found
two equations, (7.20) and (7.23). The remaining four equations are found by equating the
total frequency to unity and the first three even moments to their true values given at (7.9),
i.e. setting

\[
\tfrac12\mu_{2k} = \int_0^{1/\sqrt{6}} dx\,x^{2k}F(x) + \int_{1/\sqrt{6}}^{3/2} dx\,x^{2k}G(x) \quad (k = 0, 1, 2, 3).
\]

On substituting for o s given by (7-20), for b 0 given by (7-22) and for 6 S given by (7*23), we 
find the four equations in a 2 , a 3 , b x , 6 2 : 

0-0 2 90721a 2 + 0-0 2 139999a 8 +0-2979816 1 + 0-10844l6 2 = -0-071173/ 

0-0 3 64802a 2 + 0-0 3 110237a 8 + 0-2683326 1 + 0-0825906 2 = -0-066293, 

2 8 1 2 (7-24) 

0-0 4 6901o 2 + 0-0 4 107175a s +0-3032486 1 + 0-0790866 2 = —0-076019, 

0-0 6 6368a 2 + 0-0 6 1169a s + 0-4000906! + 0-0896746 2 = -0-101300. 


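On solution (and checking by substitution) the coefficients are found to give finally the
following frequencies:

\[
\begin{aligned}
&\text{Zone } 0 \text{ to } 1/\sqrt{6}: & f_5(x) &= 0.606563 - 0.3307x^2 + 3.1956x^3 - 6.1129x^4,\\
&\text{Zone } 1/\sqrt{6} \text{ to } \tfrac32: & f_5(x) &= 0.620775 - 0.594117\,(x - 1/\sqrt{6})^{1/2} + 0.219166\,(\tfrac32 - x)^{1/2} - 0.268273\,(\tfrac32 - x)\\
& & &\quad + 0.067263\,(\tfrac32 - x)^2 - 0.029195\,(\tfrac32 - x)^3,
\end{aligned}
\tag{7.25}
\]

with $x = t_5$.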

The extremely interesting form of the frequency curve may be observed from Fig. 5. In
the first half-zone the frequency shows but little variation: the curve declines to a minimum
of 0.6058 at x = 0.0894 then rises to a maximum of 0.6136 at x = 0.3027. It then recedes to
the link $1/\sqrt{6}$, where it assumes the value 0.5991. As one type of check on the reliability of
the results in general, some ordinates were computed directly (i.e. using (7.4) and (7.1)),
or by approximate integration using (7.4) and (6.8) and compared with the ordinates
computed from (7.25) to the following effect:


                    Value of frequency
  Trial value     By approx.
  of t5           integration     By (7.25)

  0.15             0.6069          0.6068
  0.3              0.6106          0.6136
  0.6              0.3650          0.3603
  0.9              0.2232          0.2308
  1.2              0.1377          0.1371
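The two zonal expressions at (7.25) can be evaluated directly. The sketch below is an illustration only: the coefficients are those of (7.25) as read above, and the function names and the use of a midpoint rule are choices made for this sketch. It evaluates the frequency at some of the trial values and integrates it over the whole range as a check that the total frequency is close to unity.

```python
import math

LINK = 1 / math.sqrt(6)     # second link of t5
LIMIT = 1.5                 # limit of range of t5

def f5(x):
    """Approximate frequency of sqrt(b1) for samples of 5, as at (7.25)."""
    x = abs(x)                              # the frequency is symmetrical
    if x <= LINK:
        return 0.606563 - 0.3307 * x**2 + 3.1956 * x**3 - 6.1129 * x**4
    if x <= LIMIT:
        d = LIMIT - x
        return (0.620775 - 0.594117 * math.sqrt(x - LINK)
                + 0.219166 * math.sqrt(d)
                - 0.268273 * d + 0.067263 * d**2 - 0.029195 * d**3)
    return 0.0

def total_frequency(steps=20000):
    """Midpoint-rule integral of f5 over (-3/2, 3/2), zone by zone."""
    half = 0.0
    for lo, hi in ((0.0, LINK), (LINK, LIMIT)):
        h = (hi - lo) / steps
        half += h * sum(f5(lo + (i + 0.5) * h) for i in range(steps))
    return 2 * half

for t in (0.15, 0.3, 0.6, 0.9, 1.2):
    print(t, round(f5(t), 4))
print(total_frequency())
```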


Except perhaps for the frequency at $t_5 = 0.9$, the correspondence is satisfactory; there can
be little doubt that the more accurate figures are those from (7.25).

As a stringent test of the accuracy of the frequency the 8th moment $\mu'_8$ was computed
from the empirical curves at (7.25) and compared with the actual value given at (7.9):

\[
\mu'_8 = 0.7191, \quad \mu_8 = 0.7194.
\]

Even as the figures stand the check is decisive: it should be added that the 4th place of
decimals in $\mu'_8$ is suspect owing to the approximation used.

8. Samples of 6 

In this case the links are 0, $1/\sqrt{2}$ and $4/\sqrt{5}$, and the link frequencies at the first two were found by approximate integration using form (2.13) with $t_5$ given by (2.8). For this purpose, drawings were made of the two sections of $f_5(t_5)$ on a scale sufficient to ensure that an ordinate read for any abscissa would probably be correct to the 3rd place of decimals. For intervals of 1°, values of $t_5$ were computed over the whole range by (2.8) (for $t_6$ given), and the value of $f_5(t_5)$ was read off graphically for each $t_5$. Hundreds of readings had to be made, but the work, with a little practice, was rapid and accurate, the entries being practically self-checking. The Gregory formula (using 2 correction terms) was used to give the following results:

$$f_6(0) = 0.6889; \qquad f_6(1/\sqrt{2}) = 0.3247. \tag{8.1}$$
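The Gregory formula mentioned above can be sketched as follows. This is a minimal illustration, assuming the usual form of the rule (trapezoidal sum plus end corrections in the first and second differences); the function name `gregory_integral` and the test integrands are illustrative and are not taken from the paper.

```python
import math


def gregory_integral(f, a, b, n):
    """Integrate f over (a, b) on n equal intervals: trapezoidal rule plus
    the first two Gregory correction terms (end differences)."""
    if n < 2:
        raise ValueError("at least two intervals are needed for second differences")
    h = (b - a) / n
    y = [f(a + i * h) for i in range(n + 1)]
    trapezoid = h * (0.5 * y[0] + sum(y[1:-1]) + 0.5 * y[-1])
    # Forward differences at the start, backward differences at the end.
    d1_start = y[1] - y[0]
    d1_end = y[-1] - y[-2]
    d2_start = y[2] - 2 * y[1] + y[0]
    d2_end = y[-1] - 2 * y[-2] + y[-3]
    return (trapezoid
            - h / 12 * (d1_end - d1_start)
            - h / 24 * (d2_end + d2_start))


cubic = gregory_integral(lambda x: x ** 3, 0.0, 1.0, 4)
sine = gregory_integral(math.sin, 0.0, math.pi, 20)
```

With two correction terms the rule is exact for a cubic, which is why quite coarse intervals can suffice for smooth ordinates.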

The two zonal frequency curves, say $y = F(x)$ in $(0,\ 1/\sqrt{2})$ and $y = G(x)$ in $(1/\sqrt{2},\ 4/\sqrt{5})$, writing $x$ instead of $t_6$, must have the following properties:

(i) $F(0) = 0.6889$,
(ii) $F'(0) = 0$ (§4),
(iii) $F(1/\sqrt{2}) = G(1/\sqrt{2}) = 0.3247$,        (8.2)
(iv) $F'(1/\sqrt{2}) = G'(1/\sqrt{2})$ (§3),
(v) $G(x) \simeq 5(4/\sqrt{5} - x)/36$ (from (5.1)).

The curves were

$$F(x) = 0.6889 + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5,$$
$$G(x) = \tfrac{5}{36}(\beta - x) + b_2(\beta - x)^2 + b_3(\beta - x)^3 + b_4(\beta - x)^4 \quad\text{with } \beta = 4/\sqrt{5}. \tag{8.3}$$

The exact moments are

$$\mu_0 = 1, \quad \mu_2 = 0.380952, \quad \mu_4 = 0.409191, \quad \mu_6 = 0.642924, \quad \mu_8 = 1.219892. \tag{8.4}$$

It is proposed to compute the seven coefficients in (8.3) using (8.2) and (8.4). Now, with a curve of the type of $f_6(x)$, where much of the frequency is at the ends, it is evident that the contribution from the zone $(0,\ 1/\sqrt{2})$ to the higher moments $\mu_6$ and $\mu_8$ is exceedingly minute: this property was utilized to divide the single series of seven equations into two series of three and four equations using the following device. Approximate $F(x)$ by a curve $F_1(x)$ given by

$$F_1(x) = 0.6889 + a_2' x^2, \tag{8.5}$$

finding $a_2'$ simply by passing $y = F_1(x)$ through $(1/\sqrt{2},\ 0.3247)$, giving $a_2' = -0.7284$. If $\mu_{2s}$ is the moment, let $\mu_{2s}'$ and $\mu_{2s}''$ be the contributions from $F(x)$ and $G(x)$ respectively, so that $\mu_{2s} = \mu_{2s}' + \mu_{2s}''$. Let $\nu_{2s}$ be the estimate, using $F_1(x)$, of $\mu_{2s}'$. For $s = 2, 3, 4$ the values are

$$\nu_4 = 0.030318, \quad \nu_6 = 0.009360, \quad \nu_8 = 0.003397,$$

which, subtracted from the corresponding $\mu_{2s}$ given by (8.4), give very close estimates of $\mu_{2s}''$, which involve only $b_2$, $b_3$, $b_4$. The equations in order

$$\begin{aligned}
(1)\quad & 1.170178\,b_2 + 1.265837\,b_3 + 1.369316\,b_4 = 0.174500,\\
(2)\quad & 0.726980\,b_2 + 0.378949\,b_3 + 0.235502\,b_4 = 0.058771,\\
(3)\quad & 1.260067\,b_2 + 0.534702\,b_3 + 0.289663\,b_4 = 0.091217,
\end{aligned} \tag{8.6}$$

are found from (iii) at (8.2), from $\mu_6$ and from $\mu_8$.

The equations in the $a$'s are

$$\begin{aligned}
(4)\quad & 0.5\,a_2 + 0.353553\,a_3 + 0.25\,a_4 + 0.176777\,a_5 = -0.3642,\\
(5)\quad & 1.414214\,a_2 + 1.5\,a_3 + 1.414214\,a_4 + 1.25\,a_5 = -0.653179,\\
(6)\quad & 0.117851\,a_2 + 0.0625\,a_3 + 0.035355\,a_4 + 0.020833\,a_5 = -0.115790,\\
(7)\quad & 0.035355\,a_2 + 0.020833\,a_3 + 0.012627\,a_4 + 0.007813\,a_5 = -0.031433,
\end{aligned} \tag{8.7}$$

where (4) is from (iii) at (8.2), (5) from (iv) at (8.2), (6) from the total frequency $= 1$ and (7) from variance $= \mu_2$.
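As an illustration of how the three equations (8.6) in $b_2$, $b_3$, $b_4$ may be solved, the following sketch applies Gaussian elimination with partial pivoting. The paper does not say which method of solution was used; the routine `solve_linear` and the variable names are illustrative only.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination
    with partial pivoting, followed by back-substitution."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Coefficients and absolute terms of equations (1)-(3) at (8.6).
COEFFS_8_6 = [
    [1.170178, 1.265837, 1.369316],
    [0.726980, 0.378949, 0.235502],
    [1.260067, 0.534702, 0.289663],
]
RHS_8_6 = [0.174500, 0.058771, 0.091217]

b2, b3, b4 = solve_linear(COEFFS_8_6, RHS_8_6)
```

The system is moderately ill-conditioned, so the last printed decimals of the solution are sensitive to the rounding of the absolute terms.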

The solutions of (8.6) and (8.7) yield the following frequencies:

Zone                               $f_6(x)$
$0$ to $1/\sqrt{2}$:               $0.6889 - 1.2715x^2 - 2.6073x^3 + 9.5669x^4 - 6.7790x^5$,
$1/\sqrt{2}$ to $4/\sqrt{5}$:      $\tfrac{5}{36}(\beta - x) + 0.047068(\beta - x)^2 + 0.024897(\beta - x)^3 + 0.064198(\beta - x)^4$,        (8.8)

with $\beta = 4/\sqrt{5}$ and $x = t_6$.

As a check, the 4th moment computed from the foregoing curves gave 0.4108 as compared with the actual $\mu_4 = 0.4092$, an error of 0.38%. This is not of any importance from the viewpoint of the computation of the probability points, but it illustrates how, using the integral iteration method as generally in this paper, the momental check reveals increasing discrepancies with increasing $n$.
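The momental check just described can be reproduced numerically. The sketch below codes the two zonal curves for $n = 6$ with the coefficients printed at (8.8), taking the leading coefficient of the outer zone as 5/36, and integrates zone by zone with Simpson's rule; the helper names and the number of intervals are illustrative choices, not the author's procedure.

```python
import math

LINK = 1 / math.sqrt(2)    # interior link for n = 6
LIMIT = 4 / math.sqrt(5)   # limit of range for n = 6


def inner_zone(x):
    """F(x) on (0, 1/sqrt 2), coefficients as printed at (8.8)."""
    return 0.6889 - 1.2715 * x**2 - 2.6073 * x**3 + 9.5669 * x**4 - 6.7790 * x**5


def outer_zone(x):
    """G(x) on (1/sqrt 2, 4/sqrt 5), in powers of the distance from the limit."""
    u = LIMIT - x
    return (5 / 36) * u + 0.047068 * u**2 + 0.024897 * u**3 + 0.064198 * u**4


def simpson(g, a, b, intervals=2000):
    """Composite Simpson rule on an even number of equal intervals."""
    h = (b - a) / intervals
    total = g(a) + g(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


def moment(order):
    """Even moment of the symmetrical frequency, both halves included."""
    inner = simpson(lambda x: x**order * inner_zone(x), 0.0, LINK)
    outer = simpson(lambda x: x**order * outer_zone(x), LINK, LIMIT)
    return 2 * (inner + outer)


mu0, mu2, mu4 = moment(0), moment(2), moment(4)
```

Here `mu0` and `mu2` recover the fitted values (unity and 0.380952) to within rounding of the coefficients, while `mu4` comes out close to 0.4108 rather than the exact 0.4092.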

It might be thought that by constructing empirically 'almost any' symmetrical frequency curve, so that say the 0th, 2nd and 4th moments have the true values, we shall ensure that the subsequent even moments computed from such an empirical curve will approximate closely to the corresponding true values. That this is not the case may be seen by computing the 6th and 8th moments by the well-known Karl Pearson iteration formula,* where $\mu_2$ and $\mu_4$ have their true values, for $\sqrt{b_1}$ with $n = 6$:




                                 Actual       Karl Pearson     Percentage
                                              iteration        discrepancy

$\beta_4 = \mu_6/\mu_2^3$        11.6291      12.4984          +10.7
$\beta_6 = \mu_8/\mu_2^4$        57.9214      73.3990          +26.7


Even when, in the Pearson iteration for $\beta_6$, one gives $\beta_4$ the correct value, we find a percentage discrepancy of 17.9. These percentages place in perspective the minuteness of the percentage errors found in using the higher momental check as it is used throughout this paper.
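The iterated values 12.4984 and 73.3990 can be recovered under one assumption, namely that the Karl Pearson iteration here amounts to the moment recurrence of a symmetrical Pearson curve (Type II or Type VII) fitted to $\mu_2$ and $\mu_4$. The sketch below is written on that assumption; the function name and the auxiliary quantity `q` are illustrative and do not appear in the paper.

```python
def pearson_symmetric_moments(mu2, mu4, highest=8):
    """Even moments of the symmetrical Pearson curve having the given
    2nd and 4th moments, via mu_{2k+2} = (2k+1) mu2 mu_{2k} q/(q+2k)."""
    beta2 = mu4 / mu2**2
    if abs(beta2 - 3.0) < 1e-12:
        raise ValueError("beta2 = 3 is the normal limit; the recurrence degenerates")
    # For a Type II curve proportional to (1 - x^2/a^2)^m, q equals 2m + 3;
    # q is negative for Type VII (beta2 > 3).
    q = 2 * beta2 / (3 - beta2)
    moments = {2: mu2, 4: mu4}
    k = 2
    while 2 * k + 2 <= highest:
        moments[2 * k + 2] = (2 * k + 1) * mu2 * moments[2 * k] * q / (q + 2 * k)
        k += 1
    return moments


MU2, MU4 = 0.380952, 0.409191          # exact moments for n = 6, from (8.4)
fitted = pearson_symmetric_moments(MU2, MU4)
beta4_iterated = fitted[6] / MU2**3
beta6_iterated = fitted[8] / MU2**4
```

Only the first two moments enter the fit, so every higher moment is an extrapolation by the assumed curve family, which is the point made in the text.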

It is an interesting question of general import whether in work of this kind the arduous and potentially erroneous computation (by integral iteration) of the central and link frequencies could be dispensed with, and reliance placed entirely on the moments, together with the functional properties of the frequencies, which, of course, merely represent an elaboration of the Karl Pearson approach. In this connexion a couple of experiments were made on the $\sqrt{b_1}$ frequency for $n = 6$.

* Tables for Statisticians and Biometricians, Part I, 2nd ed., p. xi.

For the first experiment, the two zonal curves were assumed to have the correct order of contact, the correct form, (8.2) (v), at the limit of range and the correct values of $\mu_0\,(= 1)$, $\mu_2$, $\mu_4$ and $\mu_6$. The equations are

Zone
$0$ to $1/\sqrt{2}$:              $F_1(x) = 0.659844 - 1.075618x^2 + 0.555991x^3$,
$1/\sqrt{2}$ to $4/\sqrt{5}$:     $G_1(x) = \tfrac{5}{36}(\beta - x) + 0.080560(\beta - x)^2 - 0.085469(\beta - x)^3 + 0.133119(\beta - x)^4$ with $\beta = 4/\sqrt{5}$.        (8.9)

This gives a central frequency 0.6598 compared with the computed frequency (by (8.1)) of 0.6889. In all the circumstances the difference is not important. The 8th moment, $\mu_8'$, from (8.9), is 1.217706, or $-0.18$% in error.

The second experiment contemplated the frequency as a single-curve system with correct first derivative $(-5/36)$ at the limit and with correct $\mu_0$, $\mu_2$, $\mu_4$, $\mu_6$. The curve is

$$F_2(x) = 0.669426 - 1.510972x^2 + 1.53854x^3 - 0.60545x^4 + 0.085157x^5, \tag{8.10}$$

which has the properties: (i) the central ordinate 0.6694 is close to the actual; (ii) the limit value from the curve scarcely differs from the actual, since $F_2(4/\sqrt{5}) = 0.0015$; (iii) from the curve $\mu_8' = 1.2237$, an error of 0.31%.

All the systems (8.8), (8.9) or (8.10) yield probability points which differ very little. For instance, in the three cases, the 5% point is given by

System        5% probability point
(8.8)         1.0432
(8.9)         1.0385
(8.10)        1.0384


The practical identity of the latter two is due to the fact that the frequencies were derived on very similar hypotheses: it does not mean that the result is more reliable than that from (8.8) which, assuming the accuracy of the calculation of the link ordinates, must be deemed to be the most correct and is adopted for the iteration to the $n = 7$ stage. Nevertheless, these experiments convey the hint of general application that if we know (i) a number of moments, (ii) the limits of range and the frequency form at the limits of range, and (iii) that the amount of frequency near the limits of range is not negligible, we will probably be in a position to estimate with fair accuracy the points of low probability. For this, however, hypothesis (iii) is essential: the method has no value from the computational point of view if the frequency near the limits is negligible. This point is discussed further in § 10.

9. Samples of 7 

The functional properties of the curves at this stage are as follows. Let the three links be denoted by $\alpha$, $\beta$, $\gamma$, so that

$$\alpha = 1/\sqrt{12}, \quad \beta = 3/\sqrt{10}, \quad \gamma = 5/\sqrt{6}. \tag{9.1}$$

Denoting $t_7$ by $x$, set

$$y = x - \alpha, \quad z = \gamma - x, \tag{9.2}$$

and let the curves in the half-zone $(0,\ 1/\sqrt{12})$, and in the zones $(1/\sqrt{12},\ 3/\sqrt{10})$ and $(3/\sqrt{10},\ 5/\sqrt{6})$ be denoted respectively by $F(x)$, $G(y)$ and $H(z)$. We then have

(i) $F(0) = 0.6781 = A$,
(ii) $F'(0) = 0$,
(iii) $F(\alpha) = G(0) = 0.5870 = B$,
(iv) $F'(\alpha) = G'(0)$,
(v) $F''(\alpha) = G''(0)$,                                        (9.3)
(vi) $G(\beta - \alpha) = H(\gamma - \beta) = 0.1838 = C$,
(vii) $G'(\beta - \alpha) = -H'(\gamma - \beta)$,
(viii) $H(z) = Dz^{3/2} + c_2 z^2 + c_3 z^3$ with $D = 0.078091$.


The central and link ordinates $A$, $B$ and $C$ at (i), (iii) and (vi) were derived by the Gregory formula from (2.14), using intervals of 0.01, 0.025 and 0.05 at different sections of the integral range. The equalities in the derivatives at the links are in accordance with order of contact requirements (§3). The first term on the right of (viii) is from (5.1) with $n = 7$.

Conditions (9.3) determine the form of the polynomials:

$$\begin{aligned}
F(x) &= A + a_2 x^2 + a_4 x^4,\\
G(y) &= B + (2a_2\alpha + 4a_4\alpha^3)\,y + \tfrac{1}{2}(2a_2 + 12a_4\alpha^2)\,y^2 + b_3 y^3 + b_4 y^4,\\
H(z) &= Dz^{3/2} + c_2 z^2 + c_3 z^3,
\end{aligned} \tag{9.4}$$

with $x = t_7$.

$F(x)$ is taken as an even function of $x$ because it is symmetrical in the zone ($-1/\sqrt{12}$ to $+1/\sqrt{12}$). This should have been done in the case of $n = 5$; neglect to do so was not serious enough to render recalculation necessary.

The moments used were:

$$\mu_0 = 1, \quad \mu_2 = 0.375, \quad \mu_4 = 0.421875, \quad \mu_6 = 0.733487. \tag{9.5}$$

Using (9.3) in conjunction with $\mu_0$ and $\mu_2$ (only) in (9.5) the following equations in the six unknowns $a_2$, $a_4$, $b_3$, $b_4$, $c_2$, $c_3$ were found:

Eqn.   Left: coefficients of                                                     Right:
no.    $a_2$      $a_4$      $b_3$      $b_4$      $c_2$      $c_3$              absolute term

1      12         1          -          -          -          -                  -13.112496
2      0.816667   0.281315   0.287507   0.189756   -          -                   -0.403210
3      -          -          -          -          1.193685   1.304172             0.094626
4      1.897366   0.756233   1.306833   1.150028   2.185118   3.581055            -0.122438
5      0.229605   0.069278   0.047439   0.025048   0.434724   0.356221            -0.122152
6      0.130637   0.041871   0.032192   0.017835   0.608438   0.496634            -0.044365


Approximations to $F$, $G$ and $H$ were found:

$$\begin{aligned}
F_1(x) &= 0.6781 - 1.238888x^2 + 1.754160x^4,\\
G_1(y) &= 0.5870 - 0.546478y - 0.361808y^2 + 0.171129y^3 + 0.347163y^4,\\
H_1(z) &= 0.078091z^{3/2} - 0.017319z^2 + 0.088408z^3.
\end{aligned} \tag{9.6}$$

These yielded estimates of the 4th and 6th moments as follows:

$$\mu_4' = 0.419712, \quad \mu_6' = 0.720776, \tag{9.7}$$

differing by $-0.5$% and $-1.7$% respectively from the correct values at (9.5). These deviations were not serious from the viewpoint of probability-point determination. Nevertheless, it seemed worth while to try to achieve a closer approximation. This was done by finding a 'corrector' $\phi(x)$ (not positive, like a frequency, for all values of $x$) with the following properties:

(i) total 'frequency' zero,
(ii) '2nd moment' zero,
(iii) $\phi'(0) = 0$,                                             (9.8)
(iv) $\phi(\gamma) = \phi'(\gamma) = 0$,
(v) '4th moment' $= \mu_4 - \mu_4' = 0.002163$.

Then

$$\phi(x) = 0.002404 - 0.028853x^2 + 0.045232x^3 - 0.024236x^4 + 0.004342x^5, \tag{9.9}$$

and the frequencies finally adopted are

$0$ to $1/\sqrt{12}$:                 $F(x) = F_1(x) + \phi(x)$,
$1/\sqrt{12}$ to $3/\sqrt{10}$:       $G(y) = G_1(y) + \phi(x)$,        (9.10)
$3/\sqrt{10}$ to $5/\sqrt{6}$:        $H(z) = H_1(z) + \phi(x)$,

$F_1$, $G_1$ and $H_1$ being given by (9.6) and $x = t_7$. It is evident from the smallness of the coefficients of $\phi(x)$ in (9.9) that the correction effected by $\phi(x)$ is minute. From (9.10) the moment $\mu_6'$ is 0.728972, so that the error is reduced to about one-third of what it was using $F_1$, $G_1$ and $H_1$.
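The properties required of the corrector at (9.8) can be checked directly from its polynomial form. The sketch below takes the $x^3$ coefficient as 0.045232, the value with which the two conditions at (iv) hold to the printed accuracy; the dictionary layout and function names are illustrative.

```python
import math

GAMMA = 5 / math.sqrt(6)   # limit of range for n = 7

# Corrector phi(x) as {power: coefficient}; the x^3 coefficient is taken
# as 0.045232 (an assumption of this sketch, see the lead-in).
PHI = {0: 0.002404, 2: -0.028853, 3: 0.045232, 4: -0.024236, 5: 0.004342}


def phi(x):
    return sum(c * x**p for p, c in PHI.items())


def phi_prime(x):
    return sum(c * p * x**(p - 1) for p, c in PHI.items() if p > 0)


def phi_moment(order):
    """Two-sided 'moment' of the symmetrical corrector over (-gamma, gamma),
    obtained from the exact antiderivative of each term."""
    half = sum(c * GAMMA**(p + order + 1) / (p + order + 1) for p, c in PHI.items())
    return 2 * half


checks = {
    "total": phi_moment(0),
    "second": phi_moment(2),
    "fourth": phi_moment(4),
    "value_at_limit": phi(GAMMA),
    "slope_at_limit": phi_prime(GAMMA),
    "slope_at_centre": phi_prime(0.0),
}
```

The first two entries and the two limit conditions vanish to within rounding, and the 'fourth' entry is close to the 0.002163 required at (v).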


10. Samples of 8 

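The links and link frequencies are as follows:

Link                          Link frequency
$0$:                          $0.6927 = A$,
$\beta = 2/\sqrt{15}$:        $0.4442 = B$,
$\gamma = 2/\sqrt{3}$:        $0.1018 = C$,                                          (10.1)
$\delta = 6/\sqrt{7}$:        $0.04019153(\delta - x)^2 = D(\delta - x)^2/2$,

where $x = t_8$.

Set

$$y = x - \beta, \quad z = \delta - x, \quad k = \gamma - \beta = 0.638303, \quad \lambda = \delta - \gamma = 1.113086. \tag{10.2}$$

The orders of contact (§3) entail the following forms for the three zones:

Zone
$0$ to $2/\sqrt{15}$:               $F(x) = A + a_2\dfrac{x^2}{2!} + a_3\dfrac{x^3}{3!} + a_4\dfrac{x^4}{4!}$,
$2/\sqrt{15}$ to $2/\sqrt{3}$:      $G(y) = B + \left(a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!}\right)y + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)\dfrac{y^2}{2!} + b_3\dfrac{y^3}{3!} + b_4\dfrac{y^4}{4!}$,        (10.3)
$2/\sqrt{3}$ to $6/\sqrt{7}$:       $H(z) = D\dfrac{z^2}{2!} + c_3\dfrac{z^3}{3!} + c_4\dfrac{z^4}{4!}$.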


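Five of the seven equations required to determine the $a$, $b$ and $c$ will be found from the order of contact conditions, as follows:

(i) $B = A + a_2\dfrac{\beta^2}{2!} + a_3\dfrac{\beta^3}{3!} + a_4\dfrac{\beta^4}{4!}$,

(ii) $C = B + \left(a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!}\right)k + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)\dfrac{k^2}{2!} + b_3\dfrac{k^3}{3!} + b_4\dfrac{k^4}{4!}$,

(iii) $C = D\dfrac{\lambda^2}{2!} + c_3\dfrac{\lambda^3}{3!} + c_4\dfrac{\lambda^4}{4!}$,

(iv) $a_2\beta + a_3\dfrac{\beta^2}{2!} + a_4\dfrac{\beta^3}{3!} + \left(a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!}\right)k + b_3\dfrac{k^2}{2!} + b_4\dfrac{k^3}{3!} = -\left(D\lambda + c_3\dfrac{\lambda^2}{2!} + c_4\dfrac{\lambda^3}{3!}\right)$,

(v) $a_2 + a_3\beta + a_4\dfrac{\beta^2}{2!} + b_3 k + b_4\dfrac{k^2}{2!} = D + c_3\lambda + c_4\dfrac{\lambda^2}{2!}$.

The remaining two equations were found by equating the 0th and 2nd moments from the curves to the true values 1 and 4/11 respectively. The frequency functions found were as follows:

$$\begin{aligned}
F(x) &= 0.6927 - 0.320142x^2 - 2.7751x^3 + 3.08x^4,\\
G(y) &= 0.4442 - 0.854111y + 0.308677y^2 + 0.649680y^3 - 0.553667y^4,\\
H(z) &= 0.040192z^2 + 0.027763z^3 + 0.008933z^4,
\end{aligned} \tag{10.4}$$

where $x = t_8$.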

For reasons which will be apparent in the next section, it was not deemed necessary to 
apply higher momental checks in this case. 
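Although no higher momental check was applied at this stage, the fitted conditions themselves are easily verified. The following sketch codes the three zonal curves for $n = 8$ with the coefficients at (10.4), reading the coefficient of $y$ in $G$ as 0.854111, and evaluates the link ordinates and the 0th and 2nd moments; the helper names and the Simpson integration are illustrative and not part of the original computation.

```python
import math

BETA = 2 / math.sqrt(15)    # first link
GAMMA = 2 / math.sqrt(3)    # second link
DELTA = 6 / math.sqrt(7)    # limit of range


def F(x):
    return 0.6927 - 0.320142 * x**2 - 2.7751 * x**3 + 3.08 * x**4


def G(y):
    return 0.4442 - 0.854111 * y + 0.308677 * y**2 + 0.649680 * y**3 - 0.553667 * y**4


def H(z):
    return 0.040192 * z**2 + 0.027763 * z**3 + 0.008933 * z**4


def simpson(g, a, b, intervals=2000):
    """Composite Simpson rule on an even number of equal intervals."""
    h = (b - a) / intervals
    total = g(a) + g(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


# Each zone is expressed as a function of x = t_8 before integration.
ZONES = [
    (0.0, BETA, F),
    (BETA, GAMMA, lambda x: G(x - BETA)),
    (GAMMA, DELTA, lambda x: H(DELTA - x)),
]


def moment(order):
    """Even moment of the symmetrical frequency, both halves included."""
    return 2 * sum(simpson(lambda x, f=f: x**order * f(x), a, b) for a, b, f in ZONES)


first_link_ordinate = F(BETA)
second_link_gap = G(GAMMA - BETA) - H(DELTA - GAMMA)
mu0, mu2 = moment(0), moment(2)
```

The two moments return unity and 4/11 to within the rounding of the printed coefficients, and the ordinates of adjoining zones agree at the links.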

Reference may here be made to yet another experiment, the negative result of which may have some interest. At the $n = 6$ stage the remarkable 'regularity' which the curve assumed, after its highly bizarre appearance at the stage before, suggested that orders of contact (except at the limit of range) might be ignored at a slightly later stage and a single curve fitted using the moments only.

Using $\mu_0$, $\mu_2$, $\mu_4$ and $\mu_6$, and the $D(\delta - x)^2/2$ (see (10.1)) for the form at the limit of range, with $F_1'(0) = 0$, the following frequency curve was found:

$$F_1(x) = 0.040192(\delta - x)^2 + 0.132866(\delta - x)^3 - 0.293716(\delta - x)^4 + 0.231146(\delta - x)^5 - 0.039279(\delta - x)^6 - 0.005209(\delta - x)^7. \tag{10.5}$$

The correct values of the moments (to 6 places) were

$$\mu_0 = 1, \quad \mu_2 = 0.363636, \quad \mu_4 = 0.414644, \quad \mu_6 = 0.763334, \quad \mu_8 = 1.823617. \tag{10.6}$$

The value $\mu_8'$ of the 8th moment computed from the curve was 1.993270, an error therefore of $+9.3$%. The central ordinate $F_1(0) = 0.9017$ as compared with the actual 0.6927, so that the curve $F_1(x)$ could not validly be used for further iteration, since the frequencies near the central frequency would be considerably in error. The probability (computed from (10.5)) for $F_1(x)$ beyond the 'true' 5% probability point (computed from (10.3)) is 0.0456, which is quite accurate enough for practical purposes. This concordance, unexpected in view of the other facts mentioned, is due principally to the fact that $F_1(x)$ has the correct form at the limit of range. This experiment shows that, despite the regularity of the $\sqrt{b_1}$ distribution for $n = 8$, the problem of finding the nearly exact distribution cannot be treated in cavalier fashion.

11. Probability points for frequencies for samples of 8 or more 

By the Gram-Charlier theorem for symmetrical distributions, under general conditions any frequency $f(w)$, where $w$ has mean zero and variance unity, can be expanded in the form

$$f(w) = \exp\left\{\frac{\lambda_4}{4!\,\lambda_2^2}\left(\frac{d}{dw}\right)^4 + \frac{\lambda_6}{6!\,\lambda_2^3}\left(\frac{d}{dw}\right)^6 + \frac{\lambda_8}{8!\,\lambda_2^4}\left(\frac{d}{dw}\right)^8 + \dots\right\}\alpha(w), \tag{11.1}$$

where

$$\alpha(w) = \frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}w^2\right),$$

the $\lambda$ being semi-invariants of the original variate. Let $u$ be a normal variate with mean zero and variance unity. Using the method of E. A. Cornish & R. A. Fisher (1937), their expression for $w$ in terms of $u$ has been extended to the following effect:

$$w = u - \frac{\lambda_4}{24\lambda_2^2}x_3 - \frac{\lambda_4^2}{384\lambda_2^4}y_5 - \frac{\lambda_6}{720\lambda_2^3}x_5 + \frac{\lambda_4^3}{3072\lambda_2^6}y_7 - \frac{\lambda_4\lambda_6}{1152\lambda_2^5}z_7 - \frac{\lambda_8}{40320\lambda_2^4}x_7 + \dots, \tag{11.2}$$

where the $x_k$ are Hermite polynomials in $u$ of the degree indicated. The $x$, $y$ and $z$ terms in (11.2) are as follows:

$$\begin{aligned}
x_3 &= -u^3 + 3u, & y_7 &= 9u^7 - 131u^5 + 451u^3 - 321u,\\
y_5 &= 3u^5 - 24u^3 + 29u, & z_7 &= u^7 - 17u^5 + 69u^3 - 57u,\\
x_5 &= -u^5 + 10u^3 - 15u, & x_7 &= -u^7 + 21u^5 - 105u^3 + 105u.
\end{aligned} \tag{11.3}$$

At (11.2) the expansion is taken to $O(n^{-3})$ because $\lambda_{2s}/\lambda_2^s$ is $O(n^{-(s-1)})$ when the $\lambda$ are semi-invariants of $\sqrt{b_1}$ for samples of $n$.

The $x_k$, $y_k$ and $z_k$ functions at various probability levels are as follows:

Function     Probability points
             0.10          0.05          0.025         0.01          0.001

$u$          1.281552      1.644854      1.959964      2.326348      3.090223
$x_3$        1.739867      0.484338     -1.649229     -5.610905    -20.239354
$y_5$       -2.97984     -22.98240     -37.09056     -30.28992    +226.9286
$x_5$       -1.632248      7.789154     16.968942     22.868797    -33.058481
$y_7$      136.1309      194.9563      -22.4505     -675.7597    -1286.263
$z_7$       19.09291      41.19959      27.20947     -53.45639    -261.9424
$x_7$      -19.5234      -74.2935      -88.4883      -15.5752      362.6625

For $n = 8$ the semi-invariants, etc., required are

$$\begin{aligned}
\lambda_2 &= 0.363636, & \lambda_4/\lambda_2^2 &= 0.1357,\\
\lambda_4 &= 0.017950, & \lambda_6/\lambda_2^3 &= -1.1612,\\
\lambda_6 &= -0.055836, & \lambda_8/\lambda_2^4 &= 2.6577,\\
\lambda_8 &= 0.046470.
\end{aligned} \tag{11.5}$$
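The semi-invariants for $n = 8$ follow from the moments at (10.6) by the standard relations between even moments and cumulants of a symmetrical distribution. A short sketch, with illustrative names, is:

```python
def even_cumulants(mu2, mu4, mu6, mu8):
    """Semi-invariants (cumulants) of orders 2, 4, 6, 8 for a symmetrical
    distribution with mean zero, from its even moments."""
    k2 = mu2
    k4 = mu4 - 3 * mu2**2
    k6 = mu6 - 15 * mu4 * mu2 + 30 * mu2**3
    k8 = (mu8 - 28 * mu6 * mu2 - 35 * mu4**2
          + 420 * mu4 * mu2**2 - 630 * mu2**4)
    return k2, k4, k6, k8


# Moments of sqrt(b1) for n = 8, as given at (10.6).
lam2, lam4, lam6, lam8 = even_cumulants(0.363636, 0.414644, 0.763334, 1.823617)

# Standardized ratios entering the expansion (11.2).
ratio4 = lam4 / lam2**2
ratio6 = lam6 / lam2**3
ratio8 = lam8 / lam2**4
```

Because the 8th semi-invariant is a small difference of large terms, six-place moments fix it only to about four significant figures.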

If the formula at (11.2) were quite correct, and if we computed, at any probability level $\epsilon$, the value of $w$, then set $x = \lambda_2^{1/2}w$ and from (10.4) computed the probability from the end of range, the result should be exactly $\epsilon$, assuming, of course, that (10.4) gives the exact frequency distribution. When this procedure is carried out at the different pseudo-probability (i.e. the probability of $x$) levels indicated, the following results are found:

Pseudo-probability                                  0.10       0.05       0.025      0.01       0.001
(a) True probability (to $x = \lambda_2^{1/2}w$)    0.096855   0.050459   0.026825   0.011504   0.001155        (11.6)
(b) Normal probability (to $x = \lambda_2^{1/2}u$)  0.095564   0.052376   0.029502   0.013419   0.001090

The correspondence at (a) is obviously satisfactory. At first sight it might appear that at the 0.01 and 0.001 levels the divergence is (by the standards of this communication) rather marked. Actually this is not the case considering the fantastic difference in the algebraic form of the Gram-Charlier and the actual frequencies near the limit of range. The probabilities at (b) show that the normal curve gives quite a good representation. At $n = 8$, however, the comparison flatters the normal curve since, as R. A. Fisher (1930) has shown, the ratio $\mu_4/\mu_2^2$ actually assumes its normal value of 3 at $n = 7$ and reaches its greatest value at $n = 22$.

We now propose to take a step which is discussed in some detail in the final section. We shall endow the right side of (11.2) with a remainder term which will make the probability of $w$ formally the same as the pseudo-probability at (11.6). The following table shows the value of the variate $t_8\sigma^{-1}$ computed from (10.4) (where $x$ represents $t_8$) at different true probability levels, together with the corresponding value of $w$ computed from (11.2):

Probability     $t_8\sigma^{-1}$     $w$           $R$

0.10            1.253173             1.273231      -82.2
0.05            1.671682             1.666548       21.0
0.025           2.043181             2.008014      143.3
0.01            2.446126             2.389594      227.5
0.001           3.107977             3.079090      115.8


It has been seen that the difference between $w$ as given by (11.2) and the true value is $O(n^{-4})$. Accordingly the values of $R$ were found at the different probability levels by setting

$$\frac{t_8}{\sigma} - w = \frac{R}{n^4}, \tag{11.8}$$

with $n = 8$. The estimates of the probability points $P$ for values of $n \geq 8$ are accordingly

$$P = \lambda_2^{1/2}\left(A + B\frac{\lambda_4}{\lambda_2^2} + C\frac{\lambda_4^2}{\lambda_2^4} + D\frac{\lambda_6}{\lambda_2^3} + E\frac{\lambda_4^3}{\lambda_2^6} + F\frac{\lambda_4\lambda_6}{\lambda_2^5} + G\frac{\lambda_8}{\lambda_2^4} + \frac{R}{n^4}\right), \tag{11.9}$$

the values of $A$, $B$, ..., $G$ and $R$ being given in the following table:

Probability   A          B            C          D          E          F          G           R

0.10          1.281552   -0.0724945    0.00776    0.00227    0.04431   -0.01657    0.000484   -82.2
0.05          1.644854   -0.0201808    0.05985   -0.01082    0.06346   -0.03576    0.001843    21.0
0.025         1.959964    0.0687179    0.09659   -0.02357   -0.00731   -0.02362    0.002195   143.3
0.01          2.326348    0.2337877    0.07888   -0.03176   -0.21997    0.04640    0.000386   227.5
0.001         3.090223    0.8433065   -0.59096    0.04591   -0.41871    0.22738   -0.008995   115.8


The terms in the first four columns agree with, or have been derived from, Cornish & Fisher (1937). The $\lambda$'s are semi-invariants derivable from the exact values of the moments given at (1.2).
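Formulae (11.2) and (11.9) can be put together in a few lines. The sketch below computes the coefficients from the polynomials at (11.3) instead of reading them from the table, and uses the rounded $n = 8$ ratios 0.1357, $-1.1612$ and 2.6577; the function names are illustrative, and the signs of the terms follow the tabulated coefficients $B$ to $G$.

```python
import math


def cornish_fisher_w(u, g4, g6, g8):
    """Standardized variate w of (11.2) for a normal deviate u, where
    g4, g6, g8 are the ratios lambda_4/lambda_2^2, lambda_6/lambda_2^3
    and lambda_8/lambda_2^4."""
    x3 = -u**3 + 3 * u
    x5 = -u**5 + 10 * u**3 - 15 * u
    x7 = -u**7 + 21 * u**5 - 105 * u**3 + 105 * u
    y5 = 3 * u**5 - 24 * u**3 + 29 * u
    y7 = 9 * u**7 - 131 * u**5 + 451 * u**3 - 321 * u
    z7 = u**7 - 17 * u**5 + 69 * u**3 - 57 * u
    return (u
            - g4 * x3 / 24
            - g4**2 * y5 / 384
            - g6 * x5 / 720
            + g4**3 * y7 / 3072
            - g4 * g6 * z7 / 1152
            - g8 * x7 / 40320)


def probability_point(u, lam2, g4, g6, g8, remainder, n):
    """Probability point of (11.9): sqrt(lambda_2) (w + R/n^4)."""
    return math.sqrt(lam2) * (cornish_fisher_w(u, g4, g6, g8) + remainder / n**4)


# n = 8: w at the 0.10 level, and the 0.05 point with remainder R = 21.0.
w_10 = cornish_fisher_w(1.281552, 0.1357, -1.1612, 2.6577)
p_05 = probability_point(1.644854, 0.363636, 0.1357, -1.1612, 2.6577, 21.0, 8)
standardized_05 = p_05 / math.sqrt(0.363636)
```

At $n = 8$ the remainder simply restores the value of $t_8\sigma^{-1}$ from which it was derived; for larger $n$ the ratios must be recomputed from the moments and the term $R/n^4$ rapidly becomes small.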



As a test, the following is a comparison of the 0.05 and 0.01 probability points for $n = 25$ as derived by E. S. Pearson (1930) (using a Type VII curve) with the values from (11.9):

Probability level     Pearson     Geary (11.9)

0.05                  0.711       0.707
0.01                  1.061       1.062

With standard deviation $\sigma = 0.435$ it is obvious that the differences are not important. Sample number 25 is the lowest for which Pearson computed the probability points, and for two levels only. The formulae at (11.9) can probably be accepted with confidence.

[Fig. 3. Frequency of $\sqrt{b_1}$ for $n = 3$. Abscissa: variate (s.d. = 1).]

12. Conclusion 

From frequency formulae (5.2), (6.8), (7.25), (8.8), (9.10) (with (9.6) and (9.9)) and (10.4) the probability points for $\sqrt{b_1}$ for normal random samples of $n = 3, 4, 5, 6, 7$ and 8, respectively, can be determined without difficulty. The six frequency distributions are illustrated in Figs. 3-8. On each of the $\sqrt{b_1}$ frequency curves there is superimposed the normal frequency with the same standard deviation, the intention being to enable a contrast to be made between the several $\sqrt{b_1}$ curves by reference of each to the normal frequency, and to show the fairly rapid approach of the $\sqrt{b_1}$ frequency to normality with increasing $n$, even for small samples.*

* As R. A. Fisher (1930) has shown, the approach to normality is not, however, uniform with increasing $n$, as indicated, say, by $\beta_2$. See § 11 above.

In this research nothing was so remarkable as the transformation which the single step in the iteration, namely that from $n = 5$ to $n = 6$, effected in the shape of the frequency curve. From $n = 6$ on, the join at the links is effected so smoothly as to be almost imperceptible to the eye. The eye, however, flatters the actual approach to normality in the $\sqrt{b_1}$ frequency curves, as measured algebraically by the probability points.

[Fig. 4. Frequency of $\sqrt{b_1}$ for $n = 4$. Abscissa: variate (s.d. = 1).]

It may be well, at this stage, to recapitulate. Using integral iteration formula (2.13) (or (2.14)), frequency ordinates were computed at values of the variate termed the 'links', at which the frequency is shown to have functional discontinuities. Using the exact values of the moments (given at (1.2)), and taking into account the known order of contact (§ 3) of the different functions at the links and the known form assumed by the frequency at the known limit of range, inter-link frequencies were determined in polynomial form. Attention is directed to the use, at the $n = 4$ to 7 (inclusive) stages, of the higher moments for the purpose of checking the general reliability of the frequency curve (or rather series of curves joined at the links).

Of far greater practical importance, however, are the formulae (11.9) designed for the estimation of the 0.10, 0.05, 0.025, 0.01 and 0.001 probability points for normal random samples of $\sqrt{b_1}$ for $n \geq 8$. There will be little trouble about finding the corresponding formulae for other probability levels. What degree of confidence can be reposed in these formulae? This raises in an acute form the vexed question (on which the protagonists of different schools were prone to get very vexed indeed a generation ago) of how best to use moments (or semi-invariants) for estimating frequency distributions. The general problem was constantly in the writer's mind during the present research and he would be glad if his colleagues could study the possibilities of the methods which culminated in formulae (11.9) for bridging the chasm which still divides the knowledge (sometimes exact) of the lower moments of statistics like $\sqrt{b_1}$ and $b_2$ and the formulae (however empirically established) for the frequency, in which a measure of confidence can be reposed. This fundamental problem was abandoned some years ago in a thoroughly unsatisfactory condition.

[Fig. 5. Frequency of $\sqrt{b_1}$ for $n = 5$. Abscissa: variate (s.d. = 1).]

The Karl Pearson approach consists essentially in having regard to the 'shape' which experience has shown that frequency curves tend to assume and to use the first four moments for the purpose of determining the constants of the curve. The disadvantage of the Pearson method is that of itself it gives no indication as to whether the resulting curve closely follows the actual frequency: it is necessary to have recourse to such devices as comparing the curve with a frequency distribution determined from hundreds of random sample computations of the statistic under examination. Apart from the tediousness of this method it is often indecisive in regard just to the parts of the frequency which are of most importance, namely the ends, because the small numbers which the check computation throws into these zones are usually subject to large (Poisson) errors.

[Fig. 6. Frequency of $\sqrt{b_1}$ for $n = 6$. Abscissa: variate (s.d. = 1).]


The Gram-Charlier system, on the other hand, can only be used with confidence when the frequency is fairly close to the normal. In practice the reliability is judged by the convergence of such terms as one can compute from the moments, i.e. if the successive terms show an 'unmistakable' tendency to diminish one feels confident in the computed frequency.

Obviously what the Pearson, the Gram-Charlier and other frequency systems all require is a Remainder Theorem. Since, however, an infinite number of moments are required to define a frequency distribution, with only a few moments known the most that can be expected is that upper (or lower) limits of the probability of the statistic can be established as functions of the known moments. This is what Tchebychev's Theorem, and theorems of the type, do. Too much cannot be expected from the knowledge of a few moments: the approximations are almost invariably too rough for statistical use, when a high standard of efficiency is required; and M. Fréchet (1937) has shown that the Tchebychev type approximations are the best, given the assumptions, which can be made. For all their great mathematical importance (incidentally for their justification for the statistician of 'the faith that is in him'), it seems to the writer that research on these lines will not produce formulae which will be statistically utilizable in general conditions; but he may be quite wrong.



Knowing the earlier moments the Cornish-Fisher type expression (depending on the Gram-Charlier form of frequency) gives, at any probability level, an expansion for the variate to a defined order in the sample number. As might be surmised from the coefficients of the normal moments (e.g. (1.2) above), the coefficients in powers of $n^{-1}$ in the expansion of the variate usually tend to increase rapidly. In the present paper a remainder term of suitable order in $n$ has been added to the known terms in the former expansion and its coefficient found by reference to the (assumed) exactly known expansion for $n = 8$. Clearly two more terms (in $n^{-5}$ and $n^{-6}$ respectively) could have been found had we iterated the frequency to $n = 9$ and $n = 10$, respectively, though this was not deemed necessary in the present case. It would appear that in problems analogous to the $\sqrt{b_1}$ frequency, great precision might be obtainable by this method; it is proposed to use it with $b_2$; and its efficacy could be judged by iterating the frequency on a few stages further and testing the formula (with remainder) by reference to these iterations which were not used to establish the remainder.*

* Clearly the effect of the remainder diminishes rapidly with increasing $n$.

The iteration method is onerous but, with the co-operative effort to which it lends itself, is quite practicable and seems to have a range of application which is not confined to problems in which universal normality is assumed. Knowledge of the moments is not necessary for its application: in the present research the moments constituted an embarras de choix which necessitated the solution of linear equations in six or seven unknowns. In other work it may suffice to compute, from the iteration frequency, many ordinates and simply to rely on $\mu_0$, i.e. the total frequency being unity at each stage. This is what the writer did (1935) in establishing the frequency of the test of normality $a$.

[Fig. 8. Frequency of $\sqrt{b_1}$ for $n = 8$. Abscissa: variate (s.d. = 1).]

It is clear that the frequency formulae for $\sqrt{b_1}$ given in this communication cannot be regarded as 'proved' in the sense, say, that R. A. Fisher has proved the frequency of normal $t$ first given by W. S. Gosset. Even for $n = 4$ to 8, inclusive, the iteration method yields only approximations; the method, however, can be used to attain any desired degree of precision by increasing the number of frequencies computed by approximate integration at each stage, though this is not to be recommended to the solitary researcher. Formulae (11.9) for the probability points for $n \geq 8$ are more empirical and it would be useful to know how they fare at, say, $n = 12$, by comparison with the 'actual' frequency found by extension of the iteration, should any students be sufficiently interested in making the experiment, the results of which incidentally would be a guide in the further use of the method of integral iteration here exploited. The present research does not hold out much prospect of exact solutions (in the mathematical sense) of the $\sqrt{b_1}$ and $b_2$ frequency problems since such solutions would involve the exact knowledge of about $\tfrac{1}{2}n$ separate sectional frequencies in the $\sqrt{b_1}$ case. It does show that empirical solutions can be found accurately enough for practical purposes.


REFERENCES 

Cornish, E. A. & Fisher, R. A. (1937). Revue de l'Institut international de Statistique, 5e Année, 4e Livraison, pp. 309-20.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. 30, 199-238.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16-28.

Fréchet, M. (1937). Généralités sur les Probabilités. Variables aléatoires.

Geary, R. C. (1935). Biometrika, 27, 310-32 and 353-55.

Geary, R. C. (1936). Biometrika, 28, 295-305.

McKay, A. T. (1933). Biometrika, 25, 204-10.

Pearson, E. S. (1930). Biometrika, 22, 239-49.

Pearson, E. S. (1931). Biometrika, 22, 423-4.

Pearson, E. S. (1936). Biometrika, 28, 306-7.

Pepper, J. (1932). Biometrika, 24, 55-64.




ON THE COMPUTATION OF UNIVERSAL MOMENTS OF TESTS OF 
STATISTICAL NORMALITY DERIVED FROM SAMPLES DRAWN 
AT RANDOM FROM A NORMAL UNIVERSE. APPLICATION TO 
THE CALCULATION OF THE SEVENTH MOMENT OF b 2 

By R. C. GEARY and J. P. G. WORLLEDGE 

1. Introductory 

The principal object of this communication is to develop a computational technique appropriate to the formula given by one of the authors (Geary, 1933). By way of illustration the formula is applied to the computation of the seventh moment of

$$b_2 = m_4/m_2^2 = n\,\Sigma(x_i - \bar{x})^4 / \{\Sigma(x_i - \bar{x})^2\}^2, \tag{1.1}$$

where $x_1, x_2, \dots, x_n$ are the measures of the random sample of $n$ and of which $\bar{x}$ is the arithmetic mean. Universal normality is assumed throughout.

A glance at formula (3.9), in which this paper culminates, will indicate that the task of deriving higher normal moments of $b_2$ is not one to be undertaken in a frivolous spirit. The work finds its main justification in the conviction of the authors that accurate (if not exact) values of the probability points of $b_2$ can be found in terms of the moments of $b_2$ for all values of $n$ using a method which has proved successful in the case of the analogous test of asymmetry, involving

$$\sqrt{b_1} = m_3/m_2^{3/2} = n^{1/2}\,\Sigma(x_i - \bar{x})^3 / \{\Sigma(x_i - \bar{x})^2\}^{3/2}. \tag{1.2}$$

In turn, the importance of the determination of accurate probabilities for $\sqrt{b_1}$ and $b_2$ for normal samples derives from the facts revealed by unpublished work by one of the authors. This shows (1) that probabilistic inferences drawn from the well-known significance tests based on the assumption of universal normality are apt to go astray when, in fact, the universe is not normal, and (2) that $\sqrt{b_1}$ and $b_2$ provide the most efficient tests of asymmetry and kurtosis, respectively, in indefinitely large samples, amongst wide fields of alternative tests and of alternative non-normal universes.

R. A. Fisher (1930) has given the exact values of the second, fourth and sixth moments of $\sqrt{b_1}$ and J. Pepper (1932) the eighth moment. In the former paper R. A. Fisher also gave the values of the second and third moments of $b_2$. The moment field was extended by J. Wishart (1930) and in a joint paper by R. A. Fisher & J. Wishart (1931). C. T. Hsu & D. N. Lawley (1940) gave the fifth and sixth moments of $b_2$. All these authors used the combinatorial method due to R. A. Fisher (1929). The present approach is entirely different.

2. The fundamental relation 

To make the exposé complete it may be useful to reproduce the relevant part (which is quite
brief) of the 1933 paper. The method used is due essentially to C. C. Craig (1928), applied to
the normal case. Using, in the usual notation, a prefixed $E$ to indicate 'expected' or, more
accurately, 'average value for all samples', we have for the characteristic function of the

$$z_i = x_i - \bar{x} \quad (i = 1, 2, \ldots, n) \tag{2·1}$$

the expression

$$E \exp\Big(\sum_i t_i z_i\Big), \tag{2·2}$$

where the $t_i$ are $n$ parameters, so that

$$E z_1^{a_1} z_2^{a_2} \cdots z_p^{a_p},$$

where the $a_i$ are any $p$ positive integers, is the coefficient of

$$t_1^{a_1} t_2^{a_2} \cdots t_p^{a_p}$$

in the expansion of

$$a_1!\, a_2! \cdots a_p!\; E \exp\{t_1(x_1 - \bar{x}) + t_2(x_2 - \bar{x}) + \cdots + t_n(x_n - \bar{x})\}.$$

The exponent can be written in the form

$$x_1(t_1 - \bar{t}) + x_2(t_2 - \bar{t}) + \cdots + x_n(t_n - \bar{t}), \tag{2·3}$$

where $\bar{t} = \sum_{i=1}^{n} t_i / n$. Since the $x_i$ are independent,

$$E \exp \sum_i x_i(t_i - \bar{t}) = \prod_{i=1}^{n} E \exp x_i(t_i - \bar{t}). \tag{2·4}$$

Assuming, as we may without loss of generality, that the normal universe of the $x_i$ has
mean zero and unit standard deviation, we have

$$E \exp x_i(t_i - \bar{t}) = \exp \tfrac{1}{2}(t_i - \bar{t})^2.$$

Hence

$$a_1!\, a_2! \cdots a_p!\; E \exp \sum_i x_i(t_i - \bar{t}) = a_1!\, a_2! \cdots a_p!\, \exp \tfrac{1}{2} \sum_{i=1}^{n} (t_i - \bar{t})^2. \tag{2·5}$$

By definition, the power $f$ of a term is given by $f = \sum_i a_i$ and the dimension by $p$. It is clear
that the required universal mean value of

$$E (x_1 - \bar{x})^{a_1} \cdots (x_p - \bar{x})^{a_p}$$

will be found as the coefficient of $t_1^{a_1} \cdots t_p^{a_p}$ in the expansion of

$$\frac{a_1!\, a_2! \cdots a_p!}{2^k\, k!} \Big\{ t_1^2 + \cdots + t_p^2 - \frac{(t_1 + t_2 + \cdots + t_p)^2}{n} \Big\}^k, \tag{2·6}$$

where $2k = f$.
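As an added illustration (not part of the 1933 derivation), the two simplest cases of dimension one may be worked by hand from (2·6). The only assumption is $p = 1$, so that the bracket in (2·6) reduces to $t_1^2(1 - 1/n)$.

```latex
% Case a_1 = 2, so f = 2 and k = 1:
E(x_1 - \bar{x})^2
  = \text{coefficient of } t_1^2 \text{ in } \frac{2!}{2^1\, 1!}\, t_1^2 \Big(1 - \frac{1}{n}\Big)
  = 1 - \frac{1}{n}.

% Case a_1 = 4, so f = 4 and k = 2:
E(x_1 - \bar{x})^4
  = \text{coefficient of } t_1^4 \text{ in } \frac{4!}{2^2\, 2!}\, t_1^4 \Big(1 - \frac{1}{n}\Big)^2
  = 3 \Big(1 - \frac{1}{n}\Big)^2.
```

Both values agree with the elementary fact that $x_1 - \bar{x}$ is normal with mean zero and variance $1 - 1/n$.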


3. The computational scheme 

The computational scheme, which is quite general, will most clearly be outlined by reference
to the computation of the exact value of a specific moment (from origin) $\mu'_7(m_4)$, for the
derivation of which it was primarily designed. Then

$$
\begin{aligned}
\mu'_7(m_4) &= \frac{1}{n^7} E(z_1^4 + z_2^4 + \cdots + z_n^4)^7 \\
&= \frac{1}{n^7}\Big[\, n\,E(\cdot 28\cdot) + n(n-1)\Big\{\frac{7!}{6!\,1!}E(\cdot 24\cdot 4) + \frac{7!}{5!\,2!}E(\cdot 20\cdot 8) + \frac{7!}{4!\,3!}E(\cdot 16\cdot 12\cdot)\Big\} \\
&\quad + n(n-1)(n-2)\Big\{\frac{7!}{5!\,1!^2\,2!}E(\cdot 20\cdot 4^2) + \frac{7!}{4!\,2!\,1!}E(\cdot 16\cdot 84) + \frac{7!}{3!^2\,2!\,1!}E(\cdot 12^2\cdot 4) + \frac{7!}{3!\,2!^2\,2!}E(\cdot 12\cdot 8^2)\Big\} \\
&\quad + n(n-1)(n-2)(n-3)\Big\{\frac{7!}{4!\,1!^3\,3!}E(\cdot 16\cdot 4^3) + \frac{7!}{3!\,2!\,1!^2\,2!}E(\cdot 12\cdot 84^2) + \frac{7!}{2!^3\,3!\,1!}E(8^3 4)\Big\} \\
&\quad + n(n-1)(n-2)(n-3)(n-4)\Big\{\frac{7!}{3!\,1!^4\,4!}E(\cdot 12\cdot 4^4) + \frac{7!}{2!^2\,2!\,1!^3\,3!}E(8^2 4^3)\Big\} \\
&\quad + n(n-1)(n-2)(n-3)(n-4)(n-5)\,\frac{7!}{2!\,1!^5\,5!}E(84^5) \\
&\quad + n(n-1)(n-2)(n-3)(n-4)(n-5)(n-6)\,\frac{7!}{1!^7\,7!}E(4^7)\Big],
\end{aligned} \tag{3·1}
$$

where, for example,

$$E(\cdot 12\cdot 84^2) = E z_1^{12} z_2^{8} z_3^{4} z_4^{4} = E(x_1 - \bar{x})^{12}(x_2 - \bar{x})^{8}(x_3 - \bar{x})^{4}(x_4 - \bar{x})^{4}.$$

There are, accordingly, fifteen terms made up of one of dimension one, three of dimension
two, four of dimension three, etc. The structure of the numerical coefficients will be noted:
in particular that, when the power of a factorial appears in the denominator, its factorial
also appears. Each of the fifteen $E$ terms will be evaluated separately, grouped by dimensions
and multiplied by the $n$-factors.
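The structure of (3·1) can be retraced mechanically. The following is a minimal sketch, not the authors' procedure: it enumerates the partitions of 7 (one per $E$ term) and forms each numerical coefficient as $7!$ divided by the factorials of the parts and by the factorials of the multiplicities of repeated parts. The function names are illustrative only.

```python
from collections import Counter
from math import factorial


def partitions(total, largest=None):
    """Yield the partitions of `total` as non-increasing tuples of positive integers."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, largest), 0, -1):
        for rest in partitions(total - part, part):
            yield (part,) + rest


def numerical_coefficient(parts):
    """r!/(c_1! c_2! ...) divided by the factorial of each repeated multiplicity,
    where r is the sum of the parts (r = 7 for the seventh moment)."""
    coeff = factorial(sum(parts))
    for c in parts:
        coeff //= factorial(c)
    for multiplicity in Counter(parts).values():
        coeff //= factorial(multiplicity)
    return coeff


def falling_factorial(n, d):
    """n(n-1)...(n-d+1), the n-factor attached to a term of dimension d."""
    out = 1
    for j in range(d):
        out *= n - j
    return out


terms = list(partitions(7))
by_dimension = Counter(len(parts) for parts in terms)

for parts in terms:
    # Each part c corresponds to the power 4c of one z in the E term.
    powers = " ".join(str(4 * c) for c in parts)
    print(f"E({powers}): dimension {len(parts)}, coefficient {numerical_coefficient(parts)}")
```

A simple consistency test on the coefficients is that, with every $E$ replaced by unity, the bracket in (3·1) must reduce to $n^7$.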

As already stated, the value of $E(a_1 a_2 a_3 \ldots)$ will be found as the coefficient of

$$t_1^{a_1} t_2^{a_2} t_3^{a_3} \cdots$$

in the expansion of

$$\frac{a_1!\, a_2! \cdots}{2^{14}\, 14!} \Big\{ \sum_i t_i^2 + v \Big(\sum_i t_i\Big)^2 \Big\}^{14} \tag{3·2}$$

with $v = -1/n$. In this case, of course, $f = \sum a_i = 28$.

Expand (3·2) in powers of $v$ by the binomial theorem. Each of the $v$ power terms will, in
general, make a numerical contribution to the value of $E(a_1 a_2 \ldots)$ which will, accordingly,
be represented by a polynomial in $v$ of degree 14. The term in $v^s$ will be

$$\frac{a_1!\, a_2! \cdots}{14!\, 2^{14}} \cdot \frac{14!\, v^s}{(14-s)!\, s!} \cdot (14-s)!\,(2s)! \sum_S \frac{1}{(a_1 - 2s_1)!\, s_1!\, (a_2 - 2s_2)!\, s_2! \cdots}. \tag{3·3}$$

In the $\sum_S$, summation extends to all non-negative integer series $s_1, s_2, \ldots$, so that
$\sum s_i = 14 - s$, $s_1$ being associated with $a_1$, $s_2$ with $a_2$, etc. The values which the $s_i$ can
assume are obviously restricted further by the condition that

$$a_i \geq 2 s_i.$$

Let the series $(\Sigma_0, \Sigma_1, \Sigma_2, \ldots)$ of the sums $\sum_S$ in (3·3) for $s = 0, 1, 2, \ldots$ be termed the
reciprocal factorial vector (hereafter usually written 'r.f.v.') of $a_1, a_2, \ldots$, the terms of the
vector being regarded as of the order indicated by the subscript. The vector will be indicated
by clarendon type. From the computational point of view the following relation is fundamental:

$$\mathbf{A} \times \mathbf{B} = \mathbf{AB}, \tag{3·4}$$

where $\mathbf{A} = (a_1 a_2 \ldots)$ is one r.f.v. and $\mathbf{B} = (b_1 b_2 \ldots)$ any other. The multiplication sign
at (3·4) is defined as follows: the terms of $\mathbf{A}$ are multiplied respectively by $v^0, v^1, v^2$, etc., and
added to give a scalar $A$; the terms of $\mathbf{B}$ in the reverse order are also multiplied respectively by
$v^0, v^1, v^2, \ldots$ and summed to give $B$. The coefficients of $v^0, v^1, v^2, \ldots$ in the product (in the
ordinary sense) $AB$ give the vector $\mathbf{AB}$. Relation (3·4) is immediately evident from the form
of $\sum_S$ in (3·3). From this relation it is quite easy to build up r.f.v.'s from those of lower order:
44 from 4; 84 from 8 and 4; 88444 from 8 and 8444, or 88 and 444, etc.
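The building-up of r.f.v.'s by relation (3·4) may be sketched in exact rational arithmetic as follows. This is an illustration under stated assumptions rather than the hand method: all powers are taken to be even (as they are for moments of $b_2$), the vector of a single power $a$ is taken directly from the sum in (3·3), and the product is carried out as an ordinary polynomial product of the two vectors in ascending order. The names `rfv_single`, `rfv_product` and `rfv` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def rfv_single(a):
    """r.f.v. of a single even power a: the term of order s is 1/((2s)! (a/2 - s)!)."""
    k = a // 2
    return [Fraction(1, factorial(2 * s) * factorial(k - s)) for s in range(k + 1)]


def rfv_product(A, B):
    """Relation (3·4): coefficients of the ordinary product of the two v-polynomials."""
    out = [Fraction(0)] * (len(A) + len(B) - 1)
    for i, x in enumerate(A):
        for j, y in enumerate(B):
            out[i + j] += x * y
    return out


def rfv(powers):
    """r.f.v. of a set of even powers, built up one power at a time."""
    vec = [Fraction(1)]
    for a in powers:
        vec = rfv_product(vec, rfv_single(a))
    return vec


# r.f.v. of 44, i.e. 4 x 4
print(rfv([4, 4]))

# 88444 by two different factorizations, as in the text
first = rfv_product(rfv([8, 8, 4]), rfv([4, 4]))
second = rfv_product(rfv([8, 8, 4, 4]), rfv([4]))
print(first == second, first[5])
```

Agreement of the two factorizations is the same 'double' check that the text applies by hand; the term of order 5 printed last is the one worked in detail further on.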





Having found all fifteen r.f.v.'s the second step in the computational process is to form the
scalar product of each r.f.v. and $(2s)!\,v^s/s!$ (the latter will be termed the $v$-multipliers which,
it is important to note, are the same for all the terms in (3·1)). From (3·3) this gives
$E(a_1 a_2 \ldots)$ divided by

$$a_1!\, a_2! \cdots / 2^{14}.$$

The latter are multiplied by the numerical factors in (3·1) to give what are termed the
constant multipliers, 'constant' in the sense that they are the same for all the $v$ power terms
in each of the $E$'s in (3·1), but these constant terms are different for the different $E$'s. For
example, the constant multiplier for the term $E z_1^{12} z_2^{8} z_3^{4} z_4^{4}$ is

$$\frac{12!\, 8!\, 4!^2\, 7!}{3!\, 2!\, 1!^2\, 2!\, 2^{14}}.$$

Note the 'absolute constants' $7!$ and $2^{14}$, and that the powers of the term appear as factorials
in the numerator and factorials one-fourth of these powers in the denominator. In the
denominator is also a $2!$ which is the factorial of the power (here 2) to which a factorial is raised.

The third step in the computation is to sum the terms of the same dimensions. The final
step consists in the multiplication of the terms of the different dimensions by the $v$-factors
as follows:

Table 1. v-factors

Dimension    v-factors
    1        $v^6$
    2        $-(v^6 + v^5)$
    3        $2v^6 + 3v^5 + v^4$
    4        $-(6v^6 + 11v^5 + 6v^4 + v^3)$
    5        $24v^6 + 50v^5 + 35v^4 + 10v^3 + v^2$
    6        $-(120v^6 + 274v^5 + 225v^4 + 85v^3 + 15v^2 + v)$
    7        $720v^6 + 1764v^5 + 1624v^4 + 735v^3 + 175v^2 + 21v + 1$

The $v$-factors at (3·6) are, of course, the $n$-factors in (3·1) with $v = -1/n$.
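The entries of Table 1 can be regenerated from the identity $n(n-1)\cdots(n-d+1)/n^7 = (-v)^{7-d}(1+v)(1+2v)\cdots(1+(d-1)v)$ with $v = -1/n$. The sketch below is an added illustration; `poly_mul` and `v_factor` are names introduced here, and polynomials are held as coefficient lists indexed by the power of $v$.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = power of v)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def v_factor(dimension, r=7):
    """Coefficients of n(n-1)...(n-d+1)/n^r as a polynomial in v = -1/n."""
    poly = [1]
    for j in range(1, dimension):
        poly = poly_mul(poly, [1, j])        # factor (1 + j v)
    shift = r - dimension                     # leading factor (-v)^(r - d)
    sign = -1 if shift % 2 else 1
    return [0] * shift + [sign * c for c in poly]


for d in range(1, 8):
    print(d, v_factor(d))
```

Each printed list runs from the coefficient of $v^0$ up to that of $v^6$, so the rows of Table 1 are read from right to left.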

To deal with the very large whole numbers and their reciprocals which arise in factorial
computation we had recourse to a prime number index notation. For this purpose the number
is factorized into powers of the lower primes; we have used the notation for primes not
exceeding 31. Thus

$$746{,}137{,}199{,}808{,}000 = 6847 \cdot 13^1 \cdot 11^1 \cdot 7^2 \cdot 5^3 \cdot 3^5 \cdot 2^9$$

is written in this notation 6847[112359], the digits in the square brackets [ ] being the powers
of the lowest primes arranged in ascending order from the right. The ordinary number 6847
will be known as the coefficient and the symbolical number in square brackets as the primal of
the original number. Note that in this example the notation effects an economy from 15 to 10
in the number of digits required to describe the number. Should the original number not be
factorizable by a particular small prime a 0 will be inserted in the proper place, e.g. [10358]
means that 7 is not a factor of the number represented. If, as often happens with the first two
primes, the indices exceed 9, decimal points are used, e.g. [124·11·17] means that the original
number has $2^{17}$ and $3^{11}$ as factors. The primal notation can be used when the indices are all
positive or all negative: occasionally, however, + and − signs have to be mixed in the primal
(see Table 6).

With a little practice great facility is acquired in applying the ordinary rules to numbers
in primal notation. For multiplication or division corresponding digits in the primals are
added or subtracted, the coefficients being dealt with in the ordinary way. In addition or
subtraction common factors in the primals are immediately evident and the coefficient of
the sum (or difference) is derived usually by a single product-sum (or product-difference)
operation on a multiplying machine. It may be observed that all the work for this paper
was executed without inconvenience on small hand multiplying machines with capacity
9 × 8 × 13.
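For readers who wish to experiment with the notation, a minimal sketch of conversion to primal form and of multiplication by addition of indices is given below. It is an illustration only: the function names are introduced here, exponents are stored lowest prime first (the reverse of the printed order), and the dot convention for indices exceeding 9 is applied on both sides of any such index.

```python
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]


def to_primal(number):
    """Split a positive integer into (coefficient, exponents): exponents[i] is the
    power of PRIMES[i]; the coefficient keeps every prime factor above 31."""
    exponents = []
    for p in PRIMES:
        e = 0
        while number % p == 0:
            number //= p
            e += 1
        exponents.append(e)
    while exponents and exponents[-1] == 0:   # drop unused higher primes
        exponents.pop()
    return number, exponents


def format_primal(coefficient, exponents):
    """Render as in the text, highest prime first, e.g. 6847[112359]."""
    parts, prev_wide = [], False
    for e in reversed(exponents):
        text = str(e)
        wide = len(text) > 1
        if parts and (wide or prev_wide):
            parts.append("·")
        parts.append(text)
        prev_wide = wide
    prefix = "" if coefficient == 1 else str(coefficient)
    return f"{prefix}[{''.join(parts) or '0'}]"


def primal_multiply(x, y):
    """Multiply two (coefficient, exponents) pairs by adding corresponding indices."""
    (cx, ex), (cy, ey) = x, y
    width = max(len(ex), len(ey))
    ex = ex + [0] * (width - len(ex))
    ey = ey + [0] * (width - len(ey))
    return cx * cy, [a + b for a, b in zip(ex, ey)]


print(format_primal(*to_primal(746137199808000)))
print(format_primal(*primal_multiply(to_primal(720), to_primal(5040))))
```

The first line reproduces the example in the text; the second multiplies 6! by 7!/6!... that is, 720 by 5040, giving the primal of 10!.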

In the following tables the first thirty-two factorials, the $v$-multipliers and the constant
multipliers required for the computation of $\mu'_7(m_4)$ are expressed in primal notation.


Table 2. Factorials in primal notation

 0! = 1! = [0]                 17! = [111236·15]
 2! = [1]                      18! = [111238·16]
 3! = [11]                     19! = [1111238·16]
 4! = [13]                     20! = [1111248·18]
 5! = [113]                    21! = [1111349·18]
 6! = [124]                    22! = [1112349·19]
 7! = [1124]                   23! = [11112349·19]
 8! = [1127]                   24! = [1111234·10·22]
 9! = [1147]                   25! = [1111236·10·22]
10! = [1248]                   26! = [1112236·10·23]
11! = [11248]                  27! = [1112236·13·23]
12! = [1125·10]                28! = [1112246·13·25]
13! = [11125·10]               29! = [11112246·13·25]
14! = [11225·11]               30! = [11112247·14·26]
15! = [11236·11]               31! = [111112247·14·26]
16! = [11236·15]               32! = [111112247·14·31]

Table 3. v-Multipliers in factorial and primal notation

Term in    Coefficient
v^0         0! 0!^-1  = [0]
v^1         2! 1!^-1  = [1]
v^2         4! 2!^-1  = [12]
v^3         6! 3!^-1  = [113]
v^4         8! 4!^-1  = [1114]
v^5        10! 5!^-1  = [1135]
v^6        12! 6!^-1  = [11136]
v^7        14! 7!^-1  = [111137]
v^8        16! 8!^-1  = [111248]
v^9        18! 9!^-1  = [1111249]
v^10       20! 10!^-1 = [1111124·10]
v^11       22! 11!^-1 = [1111225·11]
v^12       24! 12!^-1 = [11111225·12]
v^13       26! 13!^-1 = [11111245·13]
v^14       28! 14!^-1 = [11111248·14]





Table 4. Constant multipliers in factorial and primal notation

Required for
computation of the
undermentioned
term in (3·1)

E(28)       : 28! 7!/(7! 2^14)                       = [1112246·13·11]
E(24·4)     : 24! 4! 7!/(6! 1! 2^14)                 = [1111244·11·11]
E(20·8)     : 20! 8! 7!/(5! 2! 2^14)                 = [111145·11·11]
E(16·12)    : 16! 12! 7!/(4! 3! 2^14)                = [1246·11·11]
E(20·4^2)   : 20! 4!^2 7!/(5! 1!^2 2! 2^14)          = [111134·11·10]
E(16·84)    : 16! 8! 4! 7!/(4! 2! 1! 2^14)           = [1145·10·11]
E(12^2 4)   : 12!^2 4! 7!/(3!^2 2! 1! 2^14)          = [235·11·10]
E(12·8^2)   : 12! 8!^2 7!/(3! 2!^2 2! 2^14)          = [145·10·10]
E(16·4^3)   : 16! 4!^3 7!/(4! 1!^3 3! 2^14)          = [1134·9·10]
E(12·84^2)  : 12! 8! 4!^2 7!/(3! 2! 1!^2 2! 2^14)    = [134·10·10]
E(8^3 4)    : 8!^3 4! 7!/(2!^3 3! 1! 2^14)           = [44·8·10]
E(12·4^4)   : 12! 4!^4 7!/(3! 1!^4 4! 2^14)          = [12398]
E(8^2 4^3)  : 8!^2 4!^3 7!/(2!^2 2! 1!^3 3! 2^14)    = [3389]
E(84^5)     : 8! 4!^5 7!/(2! 1!^5 5! 2^14)           = [2188]
E(4^7)      : 4!^7 7!/(1!^7 7! 2^14)                 = [77]

The theory will be illustrated by reference to the computation of $E z_1^8 z_2^8 z_3^4 z_4^4 z_5^4 = E(8^2 4^3)$.
First the r.f.v. 88444 is found as the product 884 × 44 by setting down in equal spaces the
terms of 884 and on a movable slip spaced to the former the terms of 44 in reverse:

[Diagram: the terms of 884 set against the terms of 44 in reverse on a movable slip. All primals are negative.]

The term in 88444 from the position illustrated is that of the 5th order, namely,

5[4·13] + 109[4·12] + 111·7[14·10] + 803[112·11] + 1493[114·12]
    = [114·13] (5·7·5 + 109·7·5·2 + 777·7·8 + 803·9·4 + 1493·2)
    = [114·13] (3·27737) = 27737[113·13].

The manner of computation is indicated: first the largest (negative) digits in each of the four
positions of the primals are underlined and the underlined set is regarded as the common
factor. Note how, at the final stage, the factor 3 of the coefficient reduces the primal digit
from 4 to 3. From the entries in the round brackets ( ) it will be clear that, as stated above,
the procedure is well adapted to the multiplying machine. The full calculation of 88444 is
shown in Table 5.
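The arithmetic of the 5th-order term can be confirmed in exact fractions. The following lines are an added check, not the authors' machine procedure; the helper `neg_primal` is a name introduced here and simply evaluates a coefficient with an all-negative primal over the primes 2, 3, 5 and 7.

```python
from fractions import Fraction

PRIMES = (2, 3, 5, 7)


def neg_primal(coefficient, *indices):
    """Value of coefficient[indices] when all indices are negative. Indices are
    written as in the text: highest prime first, ending with the power of 2."""
    value = Fraction(coefficient)
    for p, e in zip(PRIMES, reversed(indices)):
        value /= p ** e
    return value


terms = [
    neg_primal(5, 4, 13),            # 5[4·13]
    neg_primal(109, 4, 12),          # 109[4·12]
    neg_primal(777, 1, 4, 10),       # 111·7[14·10]
    neg_primal(803, 1, 1, 2, 11),    # 803[112·11]
    neg_primal(1493, 1, 1, 4, 12),   # 1493[114·12]
]
total = sum(terms)
common_factor = 7 * 5 * 3 ** 4 * 2 ** 13     # the underlined set [114·13]

print(total * common_factor)                  # the bracketed product-sum
print(total == neg_primal(27737, 1, 1, 3, 13))
```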

The identity of the r.f.v.'s from the two factorizations of 88444 constitutes an absolute
check of the work. The calculation of $E(8^2 4^3)$ required for (3·1) is completed in Table 6.
In practice the figures in columns (4) and (5) of this table were derived from those in column
(3), and in Tables 4 and 5 by entering the latter on two movable slips and folding opposite
each entry, as required. This stage of the work was rapidly executed. The sum-product of
columns (1), (2) and (5) gives the value of $E(8^2 4^3)$. All the r.f.v.'s required for the calculation
of the $E$'s for (3·1) are given in the appendix.
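The three steps (r.f.v., $v$-multipliers, constant multiplier) can be put together in a short sketch. It is offered as an illustration under the same assumptions as before (even powers only, exact fractions in place of primals), and the function names are hypothetical. The constant used here omits the numerical factor from (3·1), so the result is $E(a_1 a_2 \ldots)$ itself as a polynomial in $v$.

```python
from fractions import Fraction
from math import factorial


def rfv(powers):
    """Reciprocal factorial vector of a set of even powers, by repeated products."""
    vec = [Fraction(1)]
    for a in powers:
        k = a // 2
        single = [Fraction(1, factorial(2 * s) * factorial(k - s)) for s in range(k + 1)]
        out = [Fraction(0)] * (len(vec) + len(single) - 1)
        for i, x in enumerate(vec):
            for j, y in enumerate(single):
                out[i + j] += x * y
        vec = out
    return vec


def expectation_coefficients(powers):
    """Coefficients of v^0, v^1, ... in E(a_1 a_2 ...), following (3·3)."""
    k = sum(powers) // 2
    constant = Fraction(1, 2 ** k)
    for a in powers:
        constant *= factorial(a)
    return [constant * Fraction(factorial(2 * s), factorial(s)) * r
            for s, r in enumerate(rfv(powers))]


def expectation(powers, n):
    """Exact value of E(a_1 a_2 ...) for samples of n, with v = -1/n."""
    v = Fraction(-1, n)
    return sum(c * v ** s for s, c in enumerate(expectation_coefficients(powers)))


coeffs = expectation_coefficients([8, 8, 4, 4, 4])
print(coeffs[0], coeffs[1])
print(expectation([8, 8, 4, 4, 4], 10))
```

The leading coefficient is the value for an indefinitely large sample, $105 \cdot 105 \cdot 3 \cdot 3 \cdot 3$, and on multiplication by the numerical factor 105 from (3·1) it agrees with the first entry in column (5) of Table 6.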



Table 5. Calculation of reciprocal factorial vector 88444

All primals are negative

(i) By 884 × 44

Order   Sum of products                                                              r.f.v. of 88444
  0     [29]                                                                         = [29]
  1     [28] + 5[29]                                                                 = 7[29]
  2     7[3·10] + 5[28] + 109[3·11]                                                  = 9[·11]
  3     [3·10] + 35[3·10] + 109[3·10] + 111[139]                                     = 947[13·10]
  4     [4·13] + 5[3·10] + 763[4·12] + 111[138] + 803[112·12]                        = 1811[110·13]
  5     5[4·13] + 109[4·12] + 777[14·10] + 803[112·11] + 1493[114·12]                = 27737[113·13]
  6     109[5·15] + 111[14·10] + 5621[113·13] + 1493[114·11] + 389[122·14]           = 1783141[125·15]
  7     111[15·13] + 803[113·13] + 1493[15·13] + 389[122·13] + 119[124·13]           = 20627[115·13]
  8     803[114·16] + 1493[115·13] + 389[23·15] + 119[124·12] + 1543[225·17]         = 1772417[225·17]
  9     1493[116·16] + 389[123·15] + 119[25·14] + 1543[225·16] + 31[225·17]          = 547889[226·17]
 10     389[124·18] + 119[125·14] + 1543[126·18] + 31[225·16] + [225·19]             = 151331[226·19]
 11     119[126·17] + 1543[226·18] + 217[226·18] + [225·18]                          = 127[223·18]
 12     1543[227·21] + 31[226·18] + [126·20]                                         = 2329[227·21]
 13     31[227·21] + [226·20]                                                        = 37[227·21]
 14     [227·23]                                                                     = [227·23]

(ii) By 8844 × 4

Order   Sum of products                                                              r.f.v. of 88444
  0     [29]                                                                         = [29]
  1     [29] + [18]                                                                  = 7[29]
  2     [3·11] + [18] + 85[3·10]                                                     = 9[·11]
  3     [2·10] + 85[3·10] + 169[12·10]                                               = 947[13·10]
  4     85[4·12] + 169[12·10] + 11113[104·13]                                        = 1811[110·13]
  5     169[13·12] + 11113[104·13] + 5137[114·11]                                    = 27737[113·13]
  6     11113[105·15] + 5137[114·11] + 22703[124·13]                                 = 1783141[125·15]
  7     5137[115·13] + 22703[124·13] + 9341[125·13]                                  = 20627[115·13]
  8     22703[125·15] + 9341[125·13] + 90541[225·17]                                 = 1772417[225·17]
  9     9341[126·15] + 90541[225·17] + 2453[225·16]                                  = 547889[226·17]
 10     90541[226·19] + 2453[225·16] + 137[126·18]                                   = 151331[226·19]
 11     2453[226·18] + 137[126·18] + 17[226·18]                                      = 127[223·18]
 12     137[127·20] + 17[226·18] + [226·21]                                          = 2329[227·21]
 13     17[227·20] + [226·21]                                                        = 37[227·21]
 14     [227·23]                                                                     = [227·23]


Finally, the $E$'s are multiplied by the appropriate $v$-factors given in Table 1, to give the
value of $E(m_4^7)$. Now R. A. Fisher (1930) (see also Geary, 1933) has shown that

$$\mu'_7(b_2) = E(b_2^7) = E(m_4^7)/E(m_2^{14}) \tag{3·7}$$

and

$$E(m_2^{14}) = (n-1)(n+1)(n+3) \cdots (n+23)(n+25)/n^{14}. \tag{3·8}$$

Finally, /4(6 2 ) = (3 7 w 13 + 21 l-3 7 n 12 + 64,802- 3«n 12 + 13,1 54,290 •W 

+ 668,584,331 • 3 5 n 9 + 25,489,306,481 • 3 V + 74,020,784,452 • 7 • 3 5 » 7 

- 72,634,85 1 , 1 24 • 7 • 5 • 3 s ra« + 407,08 1 ,273,655 • 7 • 5 • 3 6 n B 

- 1,287,510,783,723 • 7 • 5 • 3 6 n* + 2,526,463,322,982 • 7 • 5 • 3«» 8 

- 280,521 ,238, 122 • 1 1 • 7 • 5 • 3 6 n 2 + 3,036,544,767 • 1 3 • 1 1 • 7 • 5 2 • 3«ra 

- 135,393,525 • 13 • 1 1 • 7 2 5 2 3 •)/(» + 1 ) (n + 3) . . . (n + 23) (n + 25). (3-9) 




4. Corroboration of formulae 

An integral part of the present work is the technique of check. To be of value the formulae at
(3·9) and (3·10) must be absolutely correct because (1) any errors made in factorial work are
fairly certain to be large and (2) the formulae are designed for use when $n$ is small, when
relatively small errors in the numerical coefficients may materially affect the results.
Furthermore, it is almost impossible to avoid error (even in a joint work like the present) with so
many individual calculations involving numbers astronomically large. As will appear, there
is a satisfactory, though not absolute, check at the final stage; but if it reveals error it does
not show where the error occurred, so that, if this were the sole check, there would be no
alternative but to face the tedium of complete recalculation. It is essential to devise an
absolute check at each stage. This has been done for the present technique.

Table 6. Calculation of E(8^2 4^3) from 88444

Term     r.f.v. 88444                      (3) × v-multiplier          (4) × constant
         Coefficient   Primal (neg.)       (Table 3)                   multiplier [3389]
(1)      (2)           (3)                 (4)                         (5)
v^0      1             [29]                [−2 −9]                     [3360]
v^1      7             [29]                [−2 −8]                     [3361]
v^2      9             [·11]               [1 −9]                      [3390]
v^3      947           [13·10]             [−2 −7]                     [3362]
v^4      1811          [110·13]            [1 −9]                      [3390]
v^5      27737         [113·13]            [−8]                        [3381]
v^6      1783141       [125·15]            [1 0 −1 −2 −9]              [13260]
v^7      20627         [115·13]            [1 1 0 0 −2 −6]             [113363]
v^8      1772417       [225·17]            [1 1 −1 0 −1 −9]            [112370]
v^9      547889        [226·17]            [1 1 1 −1 0 −2 −8]          [1112361]
v^10     151331        [226·19]            [1 1 1 1 −1 0 −2 −9]        [11112360]
v^11     127           [223·18]            [1 1 1 1 0 0 2 −7]          [111133·10·2]
v^12     2329          [227·21]            [1 1 1 1 1 0 0 −2 −9]       [111113360]
v^13     37            [227·21]            [1 1 1 1 1 0 2 −2 −8]       [111113561]
v^14     1             [227·23]            [1 1 1 1 1 0 2 1 −9]        [111113590]

The first check is the $n = 1$ (or $v = -1$) check. This is applicable to the $E$'s (see (3·1)) of
dimension one, two and three. It derives from the fact that

$$\Big\{ \sum_i t_i^2 + v \Big(\sum_i t_i\Big)^2 \Big\}^{14} = \Big\{ (1 + v) \sum_i t_i^2 + 2v \sum_{i<j} t_i t_j \Big\}^{14}. \tag{4·1}$$

It will be immediately evident from the latter that when $v = -1$ the following terms vanish
identically:

(i) all terms of one dimension;

(ii) all terms of two dimensions except those of the type $t_1^{14} t_2^{14}$ with which we are not
concerned;

(iii) all terms of three or four dimensions in which the highest power exceeds 14: the latter
being the highest power which, say, $t_1$ can assume in the expansion of

$$2^{14} \Big(\sum_{i<j} t_i t_j\Big)^{14}.$$



Even in terms of three dimensions in which the highest power is less than 14, e.g. in $E z_1^{12} z_2^{12} z_3^{4}$,
the $v = -1$ test can be exploited. In fact, from (2·5) and (4·1) the required terms for
$v = -1$ are

$$E(\cdot 12^2\cdot 4) = \frac{14!}{10!\, 2!^2} \cdot \frac{12!^2\, 4!\, 7!}{3!^2\, 2!\, 14!\, 2^{14}} \cdot 2^{14} = 12!\,[11124],$$

$$E(\cdot 12\cdot 8^2) = \frac{14!}{6!^2\, 2!} \cdot \frac{12!\, 8!^2\, 7!}{3!\, 2!^2\, 2!\, 14!\, 2^{14}} \cdot 2^{14} = 12!\,[3115].$$

The sum of these two terms is 131[1236·14] which should be the sum of the four $E$ terms of
dimension three in (3·1) since $E(\cdot 20\cdot 4^2)$ and $E(\cdot 16\cdot 84)$ are zero (for $v = -1$). The checks
specified in this paragraph were fully applied to the terms of one, two and three dimensions
in (3·1) before multiplication by the $v$-factors (Table 1).
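The $v = -1$ check on the terms of dimension three can be reproduced in exact arithmetic. The sketch below is an added illustration, not the authors' working: each $E$ is formed as a polynomial in $v$ from its r.f.v. (even powers assumed), evaluated at $v = -1$, and weighted by its numerical factor from (3·1). The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def rfv(powers):
    """Reciprocal factorial vector of a set of even powers, by repeated products."""
    vec = [Fraction(1)]
    for a in powers:
        k = a // 2
        single = [Fraction(1, factorial(2 * s) * factorial(k - s)) for s in range(k + 1)]
        out = [Fraction(0)] * (len(vec) + len(single) - 1)
        for i, x in enumerate(vec):
            for j, y in enumerate(single):
                out[i + j] += x * y
        vec = out
    return vec


def expectation_at(powers, v):
    """E(a_1 a_2 ...) evaluated at a given v, term by term as in (3·3)."""
    k = sum(powers) // 2
    constant = Fraction(1, 2 ** k)
    for a in powers:
        constant *= factorial(a)
    return sum(constant * Fraction(factorial(2 * s), factorial(s)) * r * v ** s
               for s, r in enumerate(rfv(powers)))


# Terms of dimension three in (3·1) with their numerical factors.
dimension_three = {(20, 4, 4): 21, (16, 8, 4): 105, (12, 12, 4): 70, (12, 8, 8): 105}
v = Fraction(-1)
total = sum(c * expectation_at(p, v) for p, c in dimension_three.items())

print(expectation_at((20, 4, 4), v), expectation_at((16, 8, 4), v))
print(total == 131 * 11 * 7 ** 2 * 5 ** 3 * 3 ** 6 * 2 ** 14)   # 131[1236·14]
```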

Reciprocal factorial vectors for dimensions exceeding two were checked fully by the
'double' factorization technique exemplified in Table 5. In view of the simplicity of the
two subsequent processes, namely those of the $v$- and constant multipliers, this check may
be taken as establishing the accuracy of the $E$'s of dimension three or more. Reference may
nevertheless be made to a check at this stage, namely that the ratios of consecutive
coefficients in each $E$ exhibit a marked regularity, if correct. Any irregularity (which in the nature
of the work will usually be large) must be suspect.

Assuming the accuracy of the $E$'s in (3·1) the final stage was checked by multiplying by
the $v$-factors (3·6) in two ways:

(i) by straight multiplication using the primal notation;

(ii) by taking (in (3·1))

$$
\begin{aligned}
\mu'_7(m_4) &= (1+v)(1+2v) \cdots (1+6v) A_1 + v(1+v) \cdots (1+5v) A_2 + \cdots + v^6 A_7 \\
&= (1+v) \cdots (1+5v) \{(1+6v) A_1 + v A_2\} + \cdots
\end{aligned}
$$

and computing in successive stages

$$B_2 = (1+6v) A_1 + v A_2, \quad B_3 = (1+5v) B_2 + v^2 A_3, \quad \text{etc.}$$

The results were the same.

A satisfactory check for the final stage is that of $n = 4$. A. T. McKay (1933) has, in fact,
given a formula for this value of $n$ from which the seventh moment from zero of $b_2$ is found
to be

82,220,810,251 /5 2 1 1 • 13* 17-23-29, 

which value also transpired on substituting 4 for $n$ in (3·9). This establishes the accuracy
of all the formula except possibly the part accruing from the terms in (3·1) in

$$n(n-1) \cdots (n-4), \quad n(n-1) \cdots (n-5) \quad \text{and} \quad n(n-1) \cdots (n-6)$$

which vanish when $n = 4$.

If it adds nothing to the check in the previous paragraph it is nevertheless of interest to
observe that, for $n = 3$, the value of the seventh moment of $b_2$ is found to be $(\tfrac{3}{2})^7$ which is as
it should be since, in this case, each $b_2$ assumes the constant value $\tfrac{3}{2}$, whether the samples
are normal or not.
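The constancy of $b_2$ for samples of three is easily confirmed numerically. The lines below are an added illustration; the function `b2` follows definition (1·1), and the samples (normal and exponential, drawn with a fixed seed) are arbitrary choices made here.

```python
import random


def b2(sample):
    """Sample kurtosis coefficient b2 = m4 / m2^2, with moments about the sample mean."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return m4 / m2 ** 2


rng = random.Random(0)
normal_values = [b2([rng.gauss(0, 1) for _ in range(3)]) for _ in range(5)]
skew_values = [b2([rng.expovariate(1.0) for _ in range(3)]) for _ in range(5)]

print(normal_values)
print(skew_values)
print(b2([0.0, 1.0, 5.0]))
```

The result follows from the identity $a^4 + b^4 + c^4 = \tfrac{1}{2}(a^2 + b^2 + c^2)^2$, which holds whenever $a + b + c = 0$.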

A partial check is also afforded at the final stage by the vanishing of all coefficients of
powers of $v$ from $v^{15}$ to $v^{20}$ inclusive.

5. Conclusion 

Previous investigators in this field have all used the combinatorial technique, invented by 
R. A. Fisher (1929) and applied in the first instance to the cumulants, which are linear 
functions of the sample moments. The present writers have not had sufficient experience in 
working the Fisher technique to decide which method is easier to apply. It is quite likely 
that the Fisher method is shorter. A strong point of the present computational scheme is 
that it lends itself to check at every stage; and the method may appeal to students who 
prefer the algebraical or arithmetical to the geometrical approach. For their benefit, and
also in case it may later be found necessary (in connexion with the accurate determination
of the probability points of $b_2$ for samples of all sizes) to compute higher moments than the
seventh (it is almost certain the seventh will be required), we give as an appendix an
extended series of reciprocal factorial vectors. From these can be derived without difficulty
(i) corresponding E' s, e.g. E(z x —x) H (x 2 — z) A (x s — x) x , on multiplication by appropriate v- 
and constant multipliers and (ii) r.f.v.’s of higher powers. 


APPENDIX 

A selection of reciprocal factorial vectors required for the calculation of moments of $b_2$ for normal
samples, including all used for the calculation of the seventh moment, in primal notation

All primals are negative 


Order    4        8         4^2       12           84
  0      [1]      [13]      [2]       [124]        [14]
  1      [1]      [12]      [1]       [114]        [4]
  2      [13]     [14]      7[13]     [26]         31[26]
  3               [124]     [13]      [135]        7[115]
  4               [1127]    [26]      [1128]       127[1128]
  5                                   [1248]       17[1138]
  6                                   [1125·10]    [113·10]



Order    4^3       16            12·4          8^2           84^2
  0      [3]       [1127]        [125]         [26]          [15]
  1      3[3]      [1125]        [123]         [24]          [13]
  2      13[5]     [137]         13[135]       5[26]         17[25]
  3      3[4]      [237]         [35]          31[136]       53[125]
  4      13[17]    [113·10]      47[1138]      323[1139]     2497[1138]
  5      [17]      [1259]        29[1247]      [1028]        173[1137]
  6      [39]      [1125·11]     157[11259]    43[124·10]    7[139]
  7                [11225·11]    [11249]       [124·10]      [1049]
  8                [11236·15]    [1126·13]     [224·14]      [114·13]





Order 


20 


16 4 


12 8 


12 4 * 


0 

[4] 

LI 248] 

[1128] 

[137] 

[126] 

1 

|2| 

[1148] 

[1028] 

[37] 

[20] 

2 

1 9| 1 4 1 

[113*10] 

1 1[ 1 3 • 10] 

31[139] 

101[138] 

3 

r,[4j 

[124*8] 

47[1238] 

7[227] 

19[136] 

4 

49|17| 

[124*11| 

253[124* 11] 

299[123* 10] 

1 1 03[ 1 1 49] 

5 

5| 16] 

[136*11] 

89[125* 11] 

73[115* 10] 

53[239] 

6 

19138] 

[1126*13] 

51[11 15* 13] 

3713[1126* 12] 

629[1 105* 11] 

7 

|38| 

[11226*12] 

1277[1 1226* 12] 

83[1 126 • 11] 

479[1 126* 10] 

8 

|4*121 

[11236*16] 

229[1 1234* 10] 

1181[1230* 16] 

773[1 120* 14] 

9 


[111238*16] 

[11135*16] 

47(1237*15] 

13[1126* 14] 

10 


[1111248*18] 

[11237*18] 

11237*17] 

[1127*10] 


8 2 4 

84 * 

4 6 

24 

20 4 

0 

[27] 

[16] 

[6] 

[1125*10] 

[1249] 

1 

5[27] 

5[16] 

5[5] 

[11249] 

[1238] 

2 

J 09(39] 

i3r«] 

125[17] 

[125*11] 

53[125* 10] 

3 

111(137] 

143(126] 

35[1 5] 

[126*11] 

31 [125* 10) 

4 

803[112* 10] 

1399[1 119] 

545[28] 

[224*14] 

23[124* 13] 

6 

1493[1 14* 10] 

239[1029 | 

23[8] 

[236*12] 

13[135* 1 1] 

6 

389[122* 12] 

6943[1 14* 1 1 J 

545(3*10] 

[ 1137 14] 

151[1 1 36* 13] 

7 

119[124* 1 1] 

66[104 * 10] 

35(39] 

[11236*14] 

733[11236*13] 

8 

1 643[225* 16] 

277[1 14* 14] 

125(4*13] 

(11237*18] 

1153| 11237* 17J 

9 

31 [225 *151 

23(115*14] 

5(4*13] 

[111239*17] 

587[1 1 1 238* 1 0] 

10 

[226*17] 

[115-16J 

[6*15] 

(1111248*19] 

67(1 101247* 18] 

11 




(1112349* 19J 

[1111049*18] 

12 




[1111234*10*22] 

[1111249*21] 


16 8 

28 

122 

8 3 

8 2 4 2 

0 

[113*10] 

[11225*11] 

[248] 

[39] 

[28] 

1 

[1129] 

[11125*11] 

[237] 

[28] 

[17] 

2 

13ri0411] 

[1126*13] 

23[249] 

[•10*] 

85[39] 

3 

43[114* 1 1] 

(1130*12] 

47[259] 

47[12*10] 

169[129] 

4 

3823[224* 14] 

[230*15] 

289(124*12] 

89[101 *13] 

11113[104* 12] 

5 

1 607 [226* 12 1 

[238*151 

593(130* 10] 

28 If 113*11] 

51 37f 1 14* 10] 

6 

4933[1136* 14] 

[123717J 

8531[1137 • 12] 

2833[124*13] 

22703[124* 12] 

7 

28943[1 1236* 14] 

[11337* 15] 

193[1136* 12] 

11[121*13]‘ 

9341(125* 12] 

8 

4331[1237* 18] 

[11248*19] 

929[1230* 10] 

2213[224*17] 

90541|225*16] 

9 

79[1 1236* 17] 

[111249* 19] 

1 13(1238* 15] 

2089[236*16] 

2453[225* 15] 

10 

43[11237* 19] 

[1111249*21] 

37[1248 • 1 7] 

71[235*18] 

1 37[ l 20 • 17] 

1] 

37[ 1 1 348 • 19] 

[111234*10*20] 

[1249*17] 

(235*18] 

17[226* 17] 

12 

[11348*22] [1111234* 10-23] 

[224*10*20] 

[336*21] 

[226*20] 


13 

14 


[1112236* 10*23] 
[1112246* 13-26] 





Order 

24*4 

20-8 

20 - 4 * 

16-12 

0 

[1125* 11] 

[125-11] 

[124-10] 

[124-11] 

1 

[1025-11] 

[25-11] 

[24-10] 

(24*11] 

2 

139(1126- 13] 

19(124-13] 

179(125-12] 

187[12&- 13] 

3 

47(1120- 12] 

331(136-12] 

87(125-11] 

379(135-12] 

4 

79[220- 15] 

7001(236-15] 

151(26-14] 

2819(234-15] 

5 

31[137- 15] 

1489(230-15] 

1501(136-14] 

17161(237-15] 

6 

1051[1237- 17] 

25513(1237-17] 

1609(1036-16] 

104507(1237-17] 

7 

79[11236* 15] 

197(11127-15] 

27487(11236-14] 

30317(11237-15] 

8 

73[ 1 1 138 * 19] 

30211(11247-19] 

13043(11137-18] 

431099(11248-19] 

9 

59(101239 * 19] 

108713(11 1249- 19] 

209509(111238-18] 

17177(11248-19] 

10 

5011[J111249*21] 

310841(1111249*21] 

2197(1101244-201 

5513(11249* 21] 

11 

8011(111234-10-20] 

127(1111336-20] 

63053(1 11 1249- 19] 

1019L1 134- 10-20] 

12 

1597(1111224-10-23] 

4013(111134-10-23] 

1873(1111248-22] 

991(1234-10-23] 

13 

47(1111234- 10-23] 

109(111135-10-23] 

101(111124-10-22] 

31(1235- 10-23] 

14 

[1111234-11-25] 

(111135-10-25] 

[111*24- 10-24] 

[1235-11-25] 


16 84 

16 - 4 * 

12*4 

12 - 8 * 

0 

[113-11] 

[112-10] 

[249] 

[14-10] 

1 

[13-11] 

[12-10] 

7(249] 

7(14-10] 

2 

* 29[14- 13] 

211(113-12] 

211(25-11] 

73(14-12] 

3 

37[ 113 • 12] 

059(123-11] 

119(25*10] 

667(25-11] 

4 

52139[225- 15] 

3071(123-14] 

3821(125-13] 

22697(125-14] 

r> 

9893[2 16- 15] 

2699(115-14] 

18667(136-13] 

4679(116-14] 

0 

299297[ 1130-171 

29593(1115-10] 

490283(1137-15] 

3257831(1137- 16J 

7 

2511043(11237-15] 

106301(11215- 14] 

193(1133-13] 

22177(1 127- 14] 

8 

3001 31 [1 1235- 19] 

62451 1(1 1135- 18] 

441773(1238-17] 

374281(1236-18] 

9 

1 7 3497[ 1 1237 • 19] 

884393(11230-18] 

24799(1238-17] 

204907(1238- 18] 

10 

58»[ 11208*21] 

1411 3(10236- 20] 

2047(1148-19] 

4783(1138-20] 

1 1 

29437[ 11348-20| 

2771(11138-19] 

677(1249-18] 

2251(1248-19] 

12 

3307(11348-231 

2579(11238-22] 

2707(224-10-21] 

1277(1339*22] 

13 

|10249-23J 

23(11238-22] 

23(224-10-21] 

61(1349-22] 

14 

[11349-25] 

[11239-24] 

[224-11-23] 

[1349-24] 


12 - 84 * 

12 - 4 * 

8*4 

8 * 4 * 

0 

[139] 

(128] 

[3-10] 

[29] 

1 

7[139] 

7(128] 

7(3-10] 

7(29] 

2 

227(14-11] 

47(3-10] 

235(4-12] 

9(11] 

3 

257(23-10] 

35(39] 

281(13-11] 

947(13-10] 

4 

97523(125-13] 

319(110-12] 

4177(112-14] 

1811(110-13] 

5 

245(5-13] 

09971(124-12] 

643(111-14] 

27737(113-13] 

0 

61219(1100- 15] 

3151259(1125-14] 

98797(124-16] 

1783141(125-15] 

7 

9479(1020-13] 

34037(1115- 12] 

907(114-14] 

20627(115-13] 

8 

12738433(1237- 17) 

1202051(1126-16] 

37151(215-18] 

1772417(225-17] 

9 

46223[1136* 17] 

115709(1126-10] 

228503(236-18] 

547889(226- 17] 

10 

53593(238- 19] 

9841(1125-18] 

50333(236-20] 

151331(226- 19J 

11 

1937(1228-18] 

1823(1127- 17] 

391(137-19] 

127(223-18] 

12 

1571(1238-21] 

157(1028-20] 

1163(336-22] 

2329(227-21] 

13 

53(1239-21] 

[1117-20] 

[325-22] 

37(227-21] 

14 

[1239-23] 

[1129-22] 

[337-24] 

[227-23] 





Order 

84* 

4 7 

0 

[18] 

[7] 

1 

7[18] 

7[7] 

2 * 

261[2* 10] 

259L19] 

3 

1061[12-9] 

77[8] 

4 

182843[113-12J 

2107[1 • 1 1] 

5 

24391[103* 12] 

1603[1 • 1 1] 

6 

127741 [1 04 * 14 J 

2977 l [3 • 13] 

7 

1313[4* 12J 

2609[3 • 11] 

8 

423697[115* 16] 

29771[4* 15] 

9 

54827[11516J 

1603[3' 15) 

10 

11455[106* 18] 

2107[4* 17] 

11 

47[6* 17] 

77[4- 16] 

12 

95L106-20] 

269L6 19] 

13 

29[117 • 20] 

7[0- 19] 

14 

LI 17 -22] 

L7-21] 


REFERENCES 

Craig, C. C. (1928). Metron, 7, 3.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.

Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 195.

Geary, R. C. (1933). Biometrika, 25, 184.

Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.

McKay, A. T. (1933). Biometrika, 25, 411.

Pepper, J. (1932). Biometrika, 24, 55.

Wishart, J. (1930). Biometrika, 22, 224.





THE ASYMPTOTICAL DISTRIBUTION OF RANGE 
IN SAMPLES FROM A NORMAL POPULATION 

By G. ELFVING, Helsingfors 

1. Introductory. Consider a sample of n observations, taken from an infinite normal 
population with the mean 0 and the standard deviation 1. Let a be the smallest and b the 
greatest of the observed values. Then w = b - a is the range of the sample. 

For certain statistical purposes knowledge of the sampling distribution of range is needed. 
The distribution function, however, involves a rather complicated integral, whose exact 
calculation is, for n > 2, impossible. Tippett (1925), E. S. Pearson (1926, 1932) and McKay 
& Pearson (1933) have studied and calculated the mean, the standard deviation and the 
Pearson constants $\beta_1$, $\beta_2$ of the range. Fitting appropriate Pearson curves to the distribution
by means of these parameters, Pearson (1932) has computed approximate percentage points
for it. Later on, Hartley (1942) and Hartley & Pearson (1942) have, by numerical integration,
tabulated the distribution function for n = 2, ..., 20. 

As pointed out by Pearson, the distribution of range is very sensitive to departures from 
normality in the tails of the parental distribution. The effect of such departures becoming 
more perceptible for increasing n, the practical importance of the range distribution is, 
perhaps, small for large samples. Nevertheless, it seems to be at least of theoretical interest 
to investigate the asymptotical distribution of range for $n \to \infty$. This is the purpose of the
present paper.* The results are summarized in a theorem at the end of the inquiry. 

2. The exact distribution. Transformations. The joint-frequency function of the extremes
a, b reads, as is well known,

$$f_{a,b}(a,b) = n(n-1)\,\phi(a)\,\phi(b)\,[\Phi(b)-\Phi(a)]^{n-2} \qquad (a \le b) \tag{2.1}$$

(cf. e.g. Cramér, 1945, p. 370). Let $u = \tfrac12(a+b)$ denote the arithmetical mean of the extreme
values of the sample. Making in (2.1) the transformation $a = u - \tfrac12 w$, $b = u + \tfrac12 w$ and
integrating with respect to u, we find for the frequency function of the range the expression

$$f_w(w) = n(n-1)\int_{-\infty}^{\infty}\phi(u-\tfrac12 w)\,\phi(u+\tfrac12 w)\,[\Phi(u+\tfrac12 w)-\Phi(u-\tfrac12 w)]^{n-2}\,du. \tag{2.2}$$

The object of our inquiry is the limiting form of the distribution (2.2). It proves, however,
more advantageous to pass to the limit in the joint distribution of a, b or u, w, before
integrating with respect to u.
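As a numerical check on (2.2), the integral can be evaluated by simple quadrature. The following is only a sketch: the function names, the integration limits of ±8 and the number of steps are assumptions made here, not part of the paper. For n = 2 the range is the absolute difference of two normal variables, whose frequency function is exp(−w²/4)/√π, and this gives an independent check.

```python
import math

def phi(x):
    """Frequency function of the normal distribution (mean 0, s.d. 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def Phi(x):
    """Distribution function of the normal distribution (mean 0, s.d. 1)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def range_density(w, n, lim=8.0, steps=4000):
    """Evaluate (2.2) by the trapezoidal rule over u in [-lim, lim].

    lim and steps are illustrative choices, not values from the paper.
    """
    h = 2.0 * lim / steps
    total = 0.0
    for i in range(steps + 1):
        u = -lim + i * h
        g = (phi(u - 0.5 * w) * phi(u + 0.5 * w)
             * (Phi(u + 0.5 * w) - Phi(u - 0.5 * w)) ** (n - 2))
        total += g if 0 < i < steps else 0.5 * g
    return n * (n - 1) * total * h

if __name__ == "__main__":
    for n in (2, 5, 20):
        print(n, [round(range_density(w, n), 4) for w in (1.0, 2.0, 4.0)])
```

The same routine may be summed over w to confirm that the frequency function integrates to unity for a given n.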

The asymptotical distribution of a and b has been investigated by Fisher & Tippett (1928)
and Gumbel (1936) (cf. also Cramér, 1945, p. 376). According to these authors, we have

$$E(u) = 0, \qquad D(u) = O(\log^{-1/2} n),$$

$$E(w) = 2\sqrt{2\log n} + O\!\left(\frac{\log\log n}{\sqrt{\log n}}\right), \qquad D(w) = O(\log^{-1/2} n). \tag{2.3}$$

From the formulae quoted it is seen that $u \to 0$, $w \to \infty$ in probability as $n \to \infty$. Our first
task must, consequently, be a transformation of the variables a, b (or u, w) depending on
n and intended to stabilize the probability mass, in order to provide a limiting distribution.

* Prof. H. Wold has kindly directed my attention to this problem. 

† $\Phi(x)$ denotes the distribution function and $\phi(x) = \Phi'(x)$ the frequency function of the normal
distribution with mean at x = 0 and unit standard deviation.




Following the example of the authors mentioned above, we should have to introduce the
new variables $a' = n\Phi(a)$, $b' = n\Phi(-b)$.

For our purpose it proves, however, advantageous to subject a' and b' to a new transformation,
independent of n, taking

$$\left.\begin{aligned} x e^{y} &= 2n\Phi(a) = 2n\Phi(-\tfrac12 w + u),\\ x e^{-y} &= 2n\Phi(-b) = 2n\Phi(-\tfrac12 w - u). \end{aligned}\right\} \tag{2.4}$$

Conversely,

$$\left.\begin{aligned} x &= 2n\sqrt{\Phi(a)\,\Phi(-b)} = 2n\sqrt{\Phi(-\tfrac12 w+u)\,\Phi(-\tfrac12 w-u)},\\ y &= \tfrac12\log\frac{\Phi(a)}{\Phi(-b)} = \tfrac12\log\frac{\Phi(-\tfrac12 w+u)}{\Phi(-\tfrac12 w-u)}. \end{aligned}\right\} \tag{2.5}$$

As $a \le b$ and thus $\Phi(a)+\Phi(-b) \le 1$, it follows from (2.4) that x, y are subjected to the
restrictions

$$x \ge 0, \qquad x\cosh y \le n. \tag{2.6}$$

Performing the transformation, we find

$$\left|\frac{\partial(a,b)}{\partial(x,y)}\right| = \frac{x}{2n^2\,\phi(a)\,\phi(b)}, \tag{2.7}$$

and thus, letting $f_n(x,y)$ denote the joint-frequency function of x, y,

$$f_n(x,y) = \frac{n-1}{2n}\,x\left(1-\frac{x\cosh y}{n}\right)^{n-2}. \tag{2.8}$$

This formula is valid in the region (2.6); outside of it, we have to put $f_n(x,y) = 0$.

The new variables x, y depend, of course, on u as well as w. It will, however, be shown later
that x, for large n, tends to coincide with the variable

$$x^* = 2n\Phi(-\tfrac12 w),$$

which depends exclusively on w. For testing purposes, the former variable may thus, in
large samples, be used as a substitute for the range. These considerations justify the
transformation (2.4) as well as a closer study of the distribution of x and its limiting form.

3. Limit passage and remainder term. The limiting form of the joint-frequency function
(2.8) is immediately seen to be

$$f(x,y) = \tfrac12\,x\,e^{-x\cosh y} \qquad (x > 0). \tag{3.1}$$

The integral of this function, taken over the whole half-plane x > 0, is easily seen to equal 1;
(3.1) is, consequently, the frequency function of a well-determined two-dimensional
distribution.

Let the marginal distribution functions in x, corresponding to (2.8) and (3.1), be denoted
by $F_n(x)$ and $F(x)$ respectively. Our next task will be to estimate the remainder
$|F_n(x) - F(x)|$, which is, obviously, at most equal to the integral

$$\Delta_n = \int_0^{\infty}\!\!\int_0^{x} 2\,|f_n(\xi,\eta) - f(\xi,\eta)|\,d\xi\,d\eta. \tag{3.2}$$


To begin with, we estimate the quotient $f_n/f$ upwards. By differentiation with respect
to the variable $z = x\cosh y$, this quotient is found to attain the maximum value

$$\left(1-\frac1n\right)\left(1-\frac2n\right)^{n-2} e^{2} = 1 + \frac1n + O\!\left(\frac{1}{n^2}\right)$$

for z = 2. We thus find, for example,

$$\frac{f_n}{f} - 1 < \frac{3}{2n} \qquad (n \ge 5). \tag{3.3}$$

For the further estimations, it proves necessary to divide the domain of integration in (3.2)
into an interior and an exterior part by means of a convenient abscissa $\eta = y$. In order to
secure the Maclaurin expansion of $\log\left(1 - \frac1n\,\xi\cosh\eta\right)$ within the interior region, we have to
choose y so as to satisfy the inequality $\frac{x\cosh y}{n} < k$ with an appropriate k < 1. Taking, for
simplicity, $k = 1 - \sqrt{\tfrac12}$ and observing that $\cosh y \le e^{y}$, we see that the condition mentioned
is fulfilled if

$$e^{y} \le \left(1-\sqrt{\tfrac12}\right)\frac{n}{x}. \tag{3.4}$$

Now we may estimate $f_n/f$ downwards in the interior domain of integration. Expanding
$\log\left(1 - \frac1n\,\xi\cosh\eta\right)$, we find

$$\log\frac{f_n}{f} = \log\left(1-\frac1n\right) + \frac2n\,\xi\cosh\eta - \frac{n-2}{2n^2}\,\xi^2\cosh^2\eta\left(1-\frac{\vartheta}{n}\,\xi\cosh\eta\right)^{-2} \qquad (0<\vartheta<1). \tag{3.5}$$

According to the determination of y, the remainder factor is seen to be < 2 for $\xi \le x$, $\eta \le y$.
For $n \ge 5$ we have $\log\left(1-\frac1n\right) > -\frac{3}{2n}$. Omitting, further, the positive term in (3.5) and
replacing n − 2 by n, we find

$$\frac{f_n(\xi,\eta)}{f(\xi,\eta)} > \exp\left(-\frac{\tfrac32+\xi^2\cosh^2\eta}{n}\right);$$

hence, combining with (3.3),

$$\left|\frac{f_n(\xi,\eta)}{f(\xi,\eta)} - 1\right| < \frac{\tfrac32+\xi^2\cosh^2\eta}{n} \qquad (\xi \le x,\ \eta \le y;\ n \ge 5). \tag{3.6}$$

In the exterior domain of integration, (3.3) directly yields

$$|f_n(\xi,\eta) - f(\xi,\eta)| < f(\xi,\eta). \tag{3.7}$$

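We proceed to the estimation of the integral (3.2), denoting its interior and exterior part
by $I_1$ and $I_2$ respectively. For the former we have, according to (3.6), the inequality

$$I_1 = \int_0^{y}\!\!\int_0^{x} 2\left|\frac{f_n}{f}-1\right| f\,d\xi\,d\eta < \frac1n\int_0^{y}\!\!\int_0^{x}\left(\tfrac32\,\xi + \xi^3\cosh^2\eta\right)e^{-\xi\cosh\eta}\,d\xi\,d\eta, \tag{3.8}$$

for the latter, according to (3.7),

$$I_2 = \int_{y}^{\infty}\!\!\int_0^{x} 2\,|f_n - f|\,d\xi\,d\eta < \int_{y}^{\infty}\!\!\int_0^{x}\xi\,e^{-\xi\cosh\eta}\,d\xi\,d\eta. \tag{3.9}$$

The integration with respect to $\xi$ may be explicitly performed. We have, in fact, putting
for brevity $\cosh\eta = \alpha$,

$$\int_0^{x}\xi\,e^{-\alpha\xi}\,d\xi = \frac{1}{\alpha^2}\left\{1 - e^{-\alpha x}\,[1+\alpha x]\right\}, \tag{3.10}$$

$$\int_0^{x}\xi^3 e^{-\alpha\xi}\,d\xi = \frac{6}{\alpha^4}\left\{1 - e^{-\alpha x}\left[1+\alpha x+\frac{\alpha^2x^2}{2}+\frac{\alpha^3x^3}{6}\right]\right\}. \tag{3.11}$$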

In order to deduce remainder formulas for (a) moderate, (b) small x, we omit in (3.10) and
(3.11), (a) all the negative terms, (b) the terms with $x^2$ and $x^3$. According to the Maclaurin
expansion

$$e^{\alpha x} = 1 + \alpha x + e^{\vartheta\alpha x}\,\frac{\alpha^2x^2}{2} \qquad (0<\vartheta<1),$$

the expression in curled brackets in (3.10) is at most equal to $\tfrac12\alpha^2x^2$. Inserting these
estimations in (3.8), we obtain for the interior integral the inequalities

$$I_1 < \frac{15}{2n}\int_0^{y}\frac{d\eta}{\cosh^2\eta} = \frac{15}{2n}\operatorname{tgh} y < \frac{15}{2n}, \tag{3.12a}$$

$$I_1 < \frac{15x^2}{4n}\int_0^{y} d\eta < \frac{4x^2}{n}\,y. \tag{3.12b}$$

For the exterior integral, (3.10) yields

$$I_2 < \int_{y}^{\infty}\frac{d\eta}{\cosh^2\eta} = 1 - \operatorname{tgh} y < 2e^{-2y}. \tag{3.13}$$

Finally, we have to join the results (3.12) and (3.13). Combining, first, (3.12a) with (3.13)
and determining $e^{-y}$ from (3.4) (taken with the equality sign), we obtain, after some slight
simplifications in the numerical coefficients,

$$\Delta_n < \frac8n\left(1+\frac{3x^2}{n}\right) \qquad (n \ge 5). \tag{3.14a}$$

Combining, on the other hand, (3.12b) with (3.13), we find

$$\Delta_n < \frac{4x^2}{n}\,y + 2e^{-2y}.$$

This expression attains, for fixed x and n, its minimum when $y = \log\frac{\sqrt n}{x}$. For $n \ge 12$, this
value of y also satisfies (3.4), and we obtain, as a parallel estimate to (3.14a),

$$\Delta_n < \frac{4x^2}{n}\left(\log\frac{\sqrt n}{x}+\frac12\right) \qquad (n \ge 12). \tag{3.14b}$$

The formulas (3.14a, b) are both valid for all positive x and all $n \ge 12$.


4. The asymptotical distribution. Having established the limiting distribution of the
variable x defined in (2.5), we are going to examine its properties.

The frequency function of the distribution considered reads, according to (3.1),

$$f(x) = x\int_0^{\infty} e^{-x\cosh y}\,dy = x\int_1^{\infty}\frac{e^{-xt}}{\sqrt{t^2-1}}\,dt. \tag{4.1}$$

Changing the order of integration, we easily find the distribution function, the mean and
the variance of (4.1) to be

$$F(x) = 1 - \int_1^{\infty}\frac{(1+xt)\,e^{-xt}}{t^2\sqrt{t^2-1}}\,dt,$$

$$E(x) = \tfrac12\pi, \qquad D^2(x) = 4 - \tfrac14\pi^2. \tag{4.2}$$

The numerical evaluation of the distribution is much simplified by the fact that f(x) as
well as F(x) is closely connected with certain Bessel functions. Denote

$$\varphi(x) = \int_0^{\infty} e^{-x\cosh y}\,dy = \int_1^{\infty}\frac{e^{-xt}}{\sqrt{t^2-1}}\,dt. \tag{4.3}$$

By differentiation and partial integration, this function is found to satisfy the differential
equation

$$\varphi''(x) + \frac1x\,\varphi'(x) - \varphi(x) = 0. \tag{4.4}$$



Changing x into −ix, we obtain for the function $\psi(x) = \varphi(-ix)$ the equation

$$\psi''(x) + \frac1x\,\psi'(x) + \psi(x) = 0; \tag{4.4'}$$

hence, $\psi(x)$ is a Bessel function of order zero.

In order to specify this function, we will deduce an asymptotical expression for the
function (4.3), valid for large x. For this purpose, we make in the latter integral (4.3) the
substitution t = 1 + u/x and write

$$\varphi(x) = \frac{e^{-x}}{\sqrt{2x}}\int_0^{\infty}\frac{e^{-u}}{\sqrt u}\left(1+\frac{u}{2x}\right)^{-1/2}du = \frac{e^{-x}}{\sqrt{2x}}\int_0^{\infty}\frac{e^{-u}}{\sqrt u}\left[1+O\!\left(\frac ux\right)\right]du. \tag{4.5}$$

Performing the integration, we obtain

$$\varphi(x) = \sqrt{\frac{\pi}{2x}}\;e^{-x}\left[1+O\!\left(\frac1x\right)\right],$$

which shows that the Bessel function $\psi(x) = \varphi(-ix)$ tends to zero for $x \to +i\infty$. This function
is, consequently, proportional to the Hankel function $H_0^{(1)}(x)$ (cf. Jahnke-Emde, 1909, p. 94).
Comparing the asymptotical expressions of $\varphi(x)$ and $iH_0^{(1)}(ix)$, we find the proportional
factor to be $\tfrac12\pi$, whence

$$f(x) = x\,\frac{\pi i}{2}\,H_0^{(1)}(ix). \tag{4.6}$$

We proceed to the calculation of F(x). Every integral of $xH_0^{(1)}(x)$ is (cf. Jahnke-Emde,
p. 165) of the form $xH_1^{(1)}(x) + \text{Const.}$, where $H_1^{(1)}(x)$ is the first order Hankel function
corresponding to $H_0^{(1)}(x)$; consequently,

$$F(x) = \frac{\pi x}{2}\,H_1^{(1)}(ix) + C.$$

Now $H_1^{(1)}(ix)$ tends to zero as $-(\tfrac12\pi x)^{-1/2}e^{-x}$ for $x \to \infty$ (cf. Jahnke-Emde, 1909, p. 101);
hence C = 1 and

$$F(x) = 1 - x\left[-\frac{\pi}{2}\,H_1^{(1)}(ix)\right]. \tag{4.7}$$

For small x, F(x) has the expansion

$$F(x) = \left(\log\frac{2}{\gamma x}+\frac12\right)\frac{x^2}{2} + \left(\log\frac{2}{\gamma x}+\frac54\right)\frac{x^4}{16} + \cdots, \tag{4.8}$$

where

$$\log\frac{2}{\gamma} = 0.11593\ldots. \tag{4.9}$$

The factors of x in (4.6) and (4.7) are tabulated in Jahnke-Emde (1909, pp. 135-6).
Below, we give a short table of f(x) and F(x). The corresponding curves are seen in Fig. 1.
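The limiting functions can also be evaluated directly from their integral representations, without tables of Bessel functions. The sketch below is illustrative only: the function names, the upper limit of integration (20) and the number of steps are assumptions made here. The expression used for F(x) is obtained by integrating (3.1) over ξ ≤ x, which gives F(x) = 1 − ∫₀^∞ (1 + x cosh y) e^(−x cosh y) / cosh² y dy; this step is a derivation added here and is not quoted from the paper.

```python
import math

def _trapezoid(g, upper, steps):
    """Trapezoidal rule for g on [0, upper]."""
    h = upper / steps
    total = 0.5 * (g(0.0) + g(upper))
    for i in range(1, steps):
        total += g(i * h)
    return total * h

def f_limit(x, upper=20.0, steps=4000):
    """f(x) = x * integral_0^inf exp(-x cosh y) dy, cf. (4.1)."""
    if x <= 0.0:
        return 0.0
    return x * _trapezoid(lambda y: math.exp(-x * math.cosh(y)), upper, steps)

def F_limit(x, upper=20.0, steps=4000):
    """F(x) = 1 - integral_0^inf (1 + x cosh y) exp(-x cosh y) / cosh(y)^2 dy."""
    if x <= 0.0:
        return 0.0

    def g(y):
        c = math.cosh(y)
        return (1.0 + x * c) * math.exp(-x * c) / (c * c)

    return 1.0 - _trapezoid(g, upper, steps)

if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(x, round(f_limit(x), 4), round(F_limit(x), 4))
```

The printed values may be compared with the entries of the short table of f(x) and F(x).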


5. Connexion between the variable x and the range. We now turn back to the original
object of our inquiry: the asymptotical distribution of the range.

Consider the variable

$$x = 2n\sqrt{\Phi(-\tfrac12 w+u)\,\Phi(-\tfrac12 w-u)} \tag{5.1}$$

introduced in (2.4). As mentioned earlier,

$$w \to \infty, \quad u \to 0 \quad \text{in probability} \quad (n \to \infty). \tag{5.2}$$

Under such circumstances, for large n, x may be expected to behave substantially as the
variable

$$x^* = 2n\Phi(-\tfrac12 w), \tag{5.3}$$

which depends exclusively on the range.

We shall now prove that $x^*/x \to 1$ in probability as $n \to \infty$. According to the well-known
asymptotic formula

$$\Phi(-z) = \frac{\phi(z)}{z}\left[1+O(z^{-2})\right] \qquad (z > 0),$$

we may, for $|u| < \tfrac12 w$, write

$$\frac{x^*}{x} = e^{u^2/2}\sqrt{1-\frac{4u^2}{w^2}}\;\left\{1+O\!\left[(\tfrac12 w-|u|)^{-2}\right]\right\}.$$


  x      f(x)     F(x)    |    x      f(x)     F(x)
 0.0    0.0000   0.0000   |   1.5    0.3207   0.5839
 0.1    0.2427   0.0146   |   2.0    0.2278   0.7202
 0.2    0.3505   0.0448   |   2.5    0.1559   0.8153
 0.3    0.4118   0.0832   |   3.0    0.1042   0.8795
 0.4    0.4458   0.1262   |   4.0    0.0446   0.9501
 0.5    0.4622   0.1718   |   5.0    0.0185   0.9798
 0.6    0.4665   0.2183   |   6.0    0.0075   0.9919
 0.7    0.4624   0.2648   |   7.0    0.0030   0.9968
 0.8    0.4522   0.3106   |   8.0    0.0012   0.9988
 0.9    0.4380   0.3552   |   9.0    0.0005   0.9995
 1.0    0.4210   0.3981   |  10.0    0.0002   0.9998

Fig. 1


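Given an arbitrary $\epsilon > 0$, we obviously may find two positive numbers $u_\epsilon$ and $w_\epsilon$ ($> u_\epsilon$)
such that

$$\left|\frac{x^*}{x}-1\right| < \epsilon \quad \text{if} \quad w \ge w_\epsilon,\ |u| \le u_\epsilon. \tag{5.4}$$

On account of (5.2), we may, on the other hand, choose $n_\epsilon$ so that the probability of the
simultaneous validity of the latter inequalities in (5.4) exceeds $1-\epsilon$ if $n \ge n_\epsilon$. Consequently,

$$P\left(\left|\frac{x^*}{x}-1\right| < \epsilon\right) > 1-\epsilon \qquad (n \ge n_\epsilon), \tag{5.5}$$

which proves our statement.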





As shown in section 3, the distribution function $F_n(x)$ of x converges to F(x) as $n \to \infty$.
Since F(0) = 0, it follows from (5.5), by a well-known method of argument, that the
distribution function $F^*(x)$ of $x^*$ converges to the same limiting function. The asymptotical
distribution of the range, suitably transformed, is hereby established.

For practical purposes, it would, of course, be desirable to possess a reasonably accurate
estimate of the remainder $F^*(x) - F(x)$, or at least an estimate of the difference $F^*(x) - F_n(x)$,
to be combined with the results (3.14).

For n = 20, the accuracy of F(x) as substitute for $F^*(x)$ may be checked by means of
Hartley's (1942) tables. The discrepancy amounts to about 0.004 for x = 0.1, 0.025 for x = 1
and 0.010 for x = 4.

The theoretical evaluation of $F^*(x) - F(x)$ seems to be somewhat complicated and, besides,
of little use since $x^*$, for most purposes, may be replaced by x. A few remarks concerning
the relations between x, $x^*$ and their distribution functions will, however, be added below.

To begin with, we note that always $x \le x^*$, the equality sign being valid only if u = 0.
Consider, in fact, the function x(u), defined by (5.1) for a fixed w. Inserting for $\Phi$ its analytical
expression, we easily find that $d^2\log x(u)/du^2 \le 0$ for all u. Hence, x(u) has no minimum and
at most one maximum, and the latter is, by symmetry, seen to be attained for u = 0, being
thus equal to $x^*$.

From $x \le x^*$, it follows that $F^*(x) \le F_n(x)$ for all x. We will show that the difference
$F_n(x) - F^*(x)$ may be expressed as a double integral.

The variables u and w are, according to (2.4), well-determined functions of x and y in the
region (2.6); and so is the variable $x^*$, on account of (5.3).

On the level curve $x^* = x_0$, w has a constant value $w_0$, determined by

$$2n\Phi(-\tfrac12 w_0) = x_0, \tag{5.6}$$

and this curve is, consequently, given in parametric form by the equations

$$x = 2n\sqrt{\Phi(-\tfrac12 w_0+u)\,\Phi(-\tfrac12 w_0-u)}, \qquad y = \tfrac12\log\frac{\Phi(-\tfrac12 w_0+u)}{\Phi(-\tfrac12 w_0-u)}, \tag{5.7}$$

where u runs through all values from $-\infty$ to $+\infty$. The latter function (5.7) being, obviously,
monotonically increasing, we may imagine u eliminated, writing (5.7) in the form

$$x = \xi_n(x_0, y) \qquad (-\infty < y < \infty). \tag{5.7'}$$

From the proof of the inequality $x \le x^*$ given above, it follows that the function (5.7') has
a single maximum for y = 0. When $y \to \pm\infty$, the function obviously tends to zero.

The inequality $x^* \le x_0$ is fulfilled on the left side of the curve (5.7'), the inequality $x \le x_0$
on the left side of the straight line $x = x_0$. Let us for brevity denote the regions (cf. Fig. 2)

$$0 \le x \le \xi_n(x_0,y), \qquad \xi_n(x_0,y) < x \le x_0 \tag{5.8}$$

by $A_n(x_0)$ and $B_n(x_0)$ respectively. The difference $F_n(x_0) - F^*(x_0)$ is, then, the probability
of the point (x, y) falling within the region $B_n(x_0)$. Dropping the indices 0, we thus obtain
the expression sought for:

$$F_n(x) - F^*(x) = \iint_{B_n(x)} f_n(\xi,\eta)\,d\xi\,d\eta. \tag{5.9}$$



Comparing, finally, the transformed range distribution function $F^*(x)$ directly with its
limiting form F(x), we find

$$\begin{aligned} F^*(x) - F(x) &= [F_n(x) - F(x)] - [F_n(x) - F^*(x)]\\ &= \iint_{\xi \le x}(f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f_n\,d\xi\,d\eta\\ &= \iint_{A_n(x)}(f_n - f)\,d\xi\,d\eta - \iint_{B_n(x)} f\,d\xi\,d\eta. \end{aligned} \tag{5.10}$$

The former integral is, obviously, at most equal to the remainder expression $\Delta_n$ in (3.2),
estimated in (3.14).


6. Conclusion. Our main results may be summarized in the following theorem:

Theorem. Consider a sample of n observations from an infinite normal population with
mean 0 and standard deviation 1. Let a be the smallest, b the greatest of the observed values,
and put

$$x = 2n\sqrt{\Phi(a)\,\Phi(-b)}, \qquad x^* = 2n\,\Phi\!\left(-\frac{b-a}{2}\right),$$

the latter variable being evidently a simple transformation of the range of the sample. Then

(1) $x \le x^*$; $x^*/x \to 1$ in probability ($n \to \infty$).

(2) The distribution functions $F_n(x)$ and $F^*(x)$ of x and $x^*$ tend, for $n \to \infty$, to the
common limit

$$F(x) = 1 - x\left[-\frac{\pi}{2}\,H_1^{(1)}(ix)\right],$$

where $H_1^{(1)}(z)$ is the first order Bessel function, which vanishes as $-(\tfrac12\pi z/i)^{-1/2}\,e^{iz}$ for $z \to +i\infty$.

(3) For $n \ge 12$, $F_n(x)$ satisfies the inequalities

$$|F_n(x) - F(x)| < \frac8n\left(1+\frac{3x^2}{n}\right), \qquad |F_n(x) - F(x)| < \frac{4x^2}{n}\left(\log\frac{\sqrt n}{x}+\frac12\right).$$
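Parts (1) and (2) of the theorem lend themselves to a simple sampling experiment. The sketch below is illustrative only: the sample size, the number of trials, the seed and all function names are assumptions chosen here, and the value 0.3981 is the tabulated F(1). It draws normal samples, forms x = 2n√[Φ(a)Φ(−b)] and x* = 2nΦ(−½(b − a)), and compares the empirical distribution functions at x = 1 with the limiting value.

```python
import math
import random

def Phi(x):
    """Distribution function of the normal distribution (mean 0, s.d. 1)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def simulate(n, trials, seed=0):
    """Return the simulated values of x and x* for `trials` samples of size n."""
    rng = random.Random(seed)
    xs, xstars = [], []
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        a, b = min(sample), max(sample)
        xs.append(2.0 * n * math.sqrt(Phi(a) * Phi(-b)))
        xstars.append(2.0 * n * Phi(-0.5 * (b - a)))
    return xs, xstars

def ecdf(values, t):
    """Empirical distribution function of `values` at the point t."""
    return sum(v <= t for v in values) / len(values)

if __name__ == "__main__":
    xs, xstars = simulate(n=200, trials=4000, seed=1)
    print("share of samples with x <= x*:",
          sum(x <= xs_ for x, xs_ in zip(xs, xstars)) / len(xs))
    print("empirical F_n(1):", ecdf(xs, 1.0))
    print("empirical F*(1): ", ecdf(xstars, 1.0))
    print("limiting F(1):    0.3981")
```

Since x never exceeds x*, the empirical distribution function of x* cannot lie above that of x, in agreement with the inequality F*(x) ≤ F_n(x).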


7. Generalization. A great part of our conclusions does not presuppose the normality
of the parental population. Thus, the distribution (2.8) of the variables x, y defined by (2.5)
is the same for any continuous probability law and so, consequently, is its limiting form;
however, if the parental distribution is non-symmetrical, with distribution function G(x),
say, the factor $\Phi(-b)$ in (2.5) must, of course, be replaced by 1 − G(b) instead of G(−b),
and the variable $x^*$ is to be defined by

$$x^* = 2n\sqrt{G(-\tfrac12 w)\,[1 - G(\tfrac12 w)]}.$$

The proof of the statement $x^*/x \to 1$ requires, however, convenient assumptions concerning
the parental distribution. It can be proved that the assertion mentioned (and,
consequently, the theorem stated above) are valid if the frequency function of this
distribution is of the form

$$g(x) = C\,e^{-k|x|^{p}},$$

where $1 < p \le 2$.





REFERENCES

Cramér, H. (1945). Mathematical Methods of Statistics. Uppsala.

Fisher, R. A. & Tippett, L. H. C. (1928). Limiting forms of the frequency distribution of the largest
or smallest member of a sample. Proc. Camb. Phil. Soc. 24, 180.

Gumbel, E. J. (1936). Les valeurs extrêmes des distributions statistiques. Ann. Inst. Poincaré, 5,
115-68.

Hartley, H. O. (1942). The range in random samples. Biometrika, 32, 334-48.

Hartley, H. O. & Pearson, E. S. (1942). The probability integral of the range in samples of n observations
from a normal population. Biometrika, 32, 301-10.

Jahnke, E. & Emde, F. (1909). Funktionentafeln. Leipzig and Berlin.

McKay, A. T. & Pearson, E. S. (1933). A note on the distribution of range in samples of n. Biometrika,
25, 416-20.

Pearson, E. S. (1926). A further note on the distribution of range in samples taken from a normal
population. Biometrika, 18, 173-94.

Pearson, E. S. (1932). The percentage limits for the distribution of range in samples from a normal
population. Biometrika, 24, 404-17.

Tippett, L. H. C. (1925). On the extreme individuals and the range of samples taken from a normal
population. Biometrika, 17, 364-87.





LIMITS OF THE RATIO OF MEAN RANGE TO 
STANDARD DEVIATION* 

By R. L. PLACKETT, B.A. 

The ratio of mean range $w_n$ in samples of n to population standard deviation $\sigma$, which
has been denoted by $d_n$, is used in control chart work (when the population is assumed
normal) to estimate $\sigma$ from the ranges of a set of small samples. On comparing the series of
values of $d_n$ for different n when the parent population is rectangular with the series when
it is normal (see table below), it is clear that for $n < 12$ the two series agree to within less
than 10%. With this in mind, the question arises: what are the limiting values of $d_n$ for
a given n? It is shown here that populations exist for which $d_n$ is arbitrarily near to zero,
while for no population will $d_n$ exceed the value

$$n\sqrt{\frac{2}{2n-1}\left\{1-\frac{[(n-1)!]^2}{(2n-2)!}\right\}}.$$


We consider a population whose distribution function is F(x) and which extends from
−a to +a so that F(−a) = 0 and F(a) = 1. The population in the first place may have
any finite limits, but there is no loss in generality in supposing these. It is required to find
limits to the ratio

$$d_n = \frac{\int_{-a}^{a}[1 - F^n - (1-F)^n]\,dx}{\left[\int_{-a}^{a}x^2\,dF - \left(\int_{-a}^{a}x\,dF\right)^2\right]^{1/2}}. \tag{1}$$

We apply the calculus of variations and find the extremes of $d_n$ in the class of functions F
such that F(−a) = 0 and F(a) = 1; the case is thus one of fixed end-points. Suppose that
F(x) = u(x) gives an extreme value and form the functions F(x) = u(x) + tv(x); for t suitably
near to zero, all these will be permissible distribution functions, i.e. monotonically increasing,
provided v(−a) = v(a) = 0. Then for t = 0, $d(d_n)/dt$ is zero for all functions v(x).


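Since

$$d_n = \frac{\int_{-a}^{a}[1-(u+tv)^n-(1-u-tv)^n]\,dx}{\left[\int_{-a}^{a}x^2(u'+tv')\,dx-\left(\int_{-a}^{a}x(u'+tv')\,dx\right)^2\right]^{1/2}},$$

$$\left[\frac{d}{dt}(d_n)\right]_{t=0} = \frac{2n\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right]\int_{-a}^{a}[(1-u)^{n-1}-u^{n-1}]\,v\,dx - \left[\int_{-a}^{a}(1-u^n-(1-u)^n)\,dx\right]\left[\int_{-a}^{a}x^2v'\,dx-2\left(\int_{-a}^{a}xu'\,dx\right)\left(\int_{-a}^{a}xv'\,dx\right)\right]}{2\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right]^{3/2}} = 0.$$

Now $\int_{-a}^{a}x^2v'\,dx = -2\int_{-a}^{a}xv\,dx$ since v(a) = v(−a) = 0, and by the same condition
$\int_{-a}^{a}xv'\,dx = -\int_{-a}^{a}v\,dx$.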


* Communication from the National Physical Laboratory. 





The numerator now becomes of the form $\int s(x)\,v(x)\,dx$, and this must be zero for all
functions v(x); it is therefore concluded that s(x) is identically equal to zero. In fact

$$n\left[\int_{-a}^{a}x^2u'\,dx-\left(\int_{-a}^{a}xu'\,dx\right)^2\right][(1-u)^{n-1}-u^{n-1}] = \left[\int_{-a}^{a}\{1-u^n-(1-u)^n\}\,dx\right]\left[\int_{-a}^{a}xu'\,dx - x\right],$$

so that if $\mu$ is the mean, $\sigma$ the standard deviation, $w_n$ the mean range in samples of n, and
F(x) the distribution function of the population which gives an extreme value to $d_n$, we have

$$w_n(x-\mu) = n\sigma^2\,[F^{n-1}-(1-F)^{n-1}].$$

Put x = −a and obtain $n\sigma^2 = w_n(\mu+a)$; x = a gives $n\sigma^2 = w_n(a-\mu)$, whence $\mu = 0$ and

$$x = a\,[F^{n-1}-(1-F)^{n-1}]. \tag{2}$$

This distribution must give an upper limit to $d_n$ since if we consider a distribution of the
type below:

[Diagram: areas y, (1 − 2y) and y located at x = −½, 0 and +½ respectively]

$$F(x) = (2x+1)\,y \qquad (-\tfrac12 < x < 0),$$

$$F(x) = 1-(1-2x)\,y \qquad (0 < x < \tfrac12),$$

the ratio (1) for $y = O(n^{-3})$ is approximately $\sqrt{3/2}\;n\sqrt{y}$, which can be made as small as we
please.
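The approximation just quoted can be checked numerically. The sketch below is illustrative only: the function names and the number of integration steps are assumptions made here, and the distribution is taken to have density 2y on each of the intervals (−½, 0) and (0, ½) with the remaining mass 1 − 2y at x = 0, which is one reading of the diagram and of the two expressions for F(x). Its variance is then y/6, and the mean range is the integral of 1 − Fⁿ − (1 − F)ⁿ.

```python
import math

def three_part_ratio(n, y, steps=20000):
    """Ratio (1) for the three-part distribution, by the midpoint rule.

    Assumed reading: density 2y on (-1/2, 0) and (0, 1/2), mass 1 - 2y at 0.
    """
    h = 0.5 / steps
    mean_range = 0.0
    for i in range(steps):
        x = (i + 0.5) * h                  # midpoint in (0, 1/2)
        F = 1.0 - (1.0 - 2.0 * x) * y      # F(x) for 0 < x < 1/2
        mean_range += 1.0 - F ** n - (1.0 - F) ** n
    mean_range *= 2.0 * h                  # the half (-1/2, 0) contributes equally
    sigma = math.sqrt(y / 6.0)             # variance = 2 * int_0^{1/2} x^2 * 2y dx
    return mean_range / sigma

def approx_ratio(n, y):
    """The approximation sqrt(3/2) * n * sqrt(y)."""
    return math.sqrt(1.5) * n * math.sqrt(y)

if __name__ == "__main__":
    for n in (5, 10, 20):
        y = n ** -3
        print(n, y, three_part_ratio(n, y), approx_ratio(n, y))
```

With y of order n⁻³ both columns decrease roughly as n^(−1/2), so the ratio can indeed be brought as near to zero as desired.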

Reverting therefore to equation (2) we note that since $a\,w_n = n\sigma^2$, $d_n(\text{max.}) = n\sigma/a$, and

$$\sigma^2 = a^2\int_0^1[F^{n-1}-(1-F)^{n-1}]^2\,dF.$$

Therefore

$$\frac{\sigma^2}{a^2} = \frac{2}{2n-1} - 2B(n,n),$$

i.e.

$$d_n(\text{max.}) = n\sqrt{\frac{2}{2n-1}\left\{1-\frac{[(n-1)!]^2}{(2n-2)!}\right\}}. \tag{3}$$

It is of interest to note that all the foregoing analysis may be carried out with a equal to
any finite value and so we may take the limit as $a \to \infty$, and equation (3), which is independent
of a, will still hold.
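Equation (3) can be verified by direct quadrature, since (2) gives the abscissa x as a function of F, that is, the quantile function of the extremal population. The sketch below is illustrative only: the function names and the number of steps are assumptions made here. It computes the mean range as E(greatest) − E(smallest) and the standard deviation from the quantile function Q(p) = a[p^(n−1) − (1 − p)^(n−1)], and compares the ratio with the closed form.

```python
import math

def d_max_closed(n):
    """Equation (3): n * sqrt( 2/(2n-1) * (1 - [(n-1)!]^2 / (2n-2)!) )."""
    frac = math.factorial(n - 1) ** 2 / math.factorial(2 * n - 2)
    return n * math.sqrt(2.0 / (2 * n - 1) * (1.0 - frac))

def d_max_numeric(n, a=1.0, steps=20000):
    """Mean range / standard deviation for the population (2), midpoint rule."""
    h = 1.0 / steps
    mean_range = 0.0
    second_moment = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        kernel = p ** (n - 1) - (1.0 - p) ** (n - 1)
        q = a * kernel                      # quantile function from (2)
        # E(max) - E(min) = int_0^1 Q(p) * n * [p^(n-1) - (1-p)^(n-1)] dp
        mean_range += q * n * kernel
        second_moment += q * q
    mean_range *= h
    sigma = math.sqrt(second_moment * h)    # the mean is zero by symmetry
    return mean_range / sigma

if __name__ == "__main__":
    for n in range(2, 13):
        print(n, round(d_max_closed(n), 5), round(d_max_numeric(n), 5),
              round(math.sqrt(n + 0.5), 5))
```

The result does not depend on the value chosen for a, in agreement with the remark above.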

It is easy to verify, by Stirling's formula or otherwise, that as n increases $[(n-1)!]^2$
becomes negligible compared with (2n − 2)!.

Consequently, for large n,

$$d_n(\text{max.}) \approx n\sqrt{\frac{2}{2n-1}} = \frac{n}{\sqrt{n-\tfrac12}},$$

i.e.

$$d_n(\text{max.}) \approx \sqrt{n+\tfrac12}. \tag{4}$$

The probability density function of (2) is obtained by differentiation and is

$$f(x) = \frac{1}{a(n-1)\,[F^{n-2}+(1-F)^{n-2}]}, \tag{5}$$

so that (2) and (5) are the parametric equations of the curve in terms of its distribution
function. Thus for n > 2, $f(0) = \frac{2^{n-3}}{a(n-1)}$ and $f(\pm a) = \frac{1}{a(n-1)}$. The distributions (2) are readily
seen to be unimodal and symmetrical about x = 0. For n = 2, 3 they are rectangular. For
$F > \tfrac12$ and large n, $aF^{n-1} \approx x$. Hence $F^{n-2} \approx x/a$, $f(x) \approx \frac{1}{x(n-1)}$. Similar considerations for
$F < \tfrac12$ show that for large n and $x \ne 0$,

$$f(x) \approx \frac{1}{|x|\,(n-1)}.$$

From (4), $\sigma \approx a/\sqrt{n}$. Consequently, for any finite a, as $n \to \infty$ the distributions (2) tend to
a single ordinate at x = 0. This should be compared with the limiting case giving $d_n \to 0$
for fixed n illustrated with the diagram above. The limiting form of the two distributions is
the same but the approach to the limit with increasing n is quite different. There is no
approach to normality.

Following is a table of $d_n$ (max.) and of $d_n$ in samples from normal and rectangular populations
for n = 2, ..., 12. The quantity $\sqrt{n+\tfrac12}$ is also included to see how closely (4) is
approximated. The values of $d_n$ (normal) are obtained from the paper by E. S. Pearson
(1942). For a rectangular distribution $d_n$ is simply $2\sqrt{3}\,(n-1)/(n+1)$.


  n    √(n+½)    d_n (max.)   d_n (normal)   d_n (rectangular)
  2    1.58114    1.15470       1.128          1.15470
  3    1.87083    1.73205       1.693          1.73205
  4    2.12132    2.08395       2.059          2.07846
  5    2.34521    2.34013       2.326          2.30940
  6    2.54951    2.55333       2.534          2.47436
  7    2.73861    2.74414       2.704          2.59808
  8    2.91548    2.92076       2.847          2.69430
  9    3.08221    3.08685       2.970          2.77128
 10    3.24037    3.24440       3.078          2.83426
 11    3.39116    3.39466       3.173          2.88675
 12    3.53553    3.53860       3.258          2.93116


Some values of d n for a number of symmetrical populations were given by Pearson 
& Adyanthaya (1928) and have been reproduced with some figures for one skew population 
in Tables for Statisticians and Biometricians, Part II, Table XXIII. The majority of these
values were obtained empirically from random sampling experiments. These values were 
of course subject to sampling error and for this reason are in three cases very slightly 
above d n (max.). 


Some of the preceding work was done as part of the Research and Development programme 
of the Ministry of Supply (S.R.17) and appears by permission of the Chief Scientific Officer. 
It was completed as part of the research programme of the National Physical Laboratory, 
and this paper is published by permission of the Director of the Laboratory. 

REFERENCES 

Pearson, E. S. (1942). The probability integral of the range in samples of n observations from
a normal population. Biometrika, 32, 301-10.

Pearson, E. S. & Adyanthaya, N. K. (1928). The distribution of frequency constants in small samples
from symmetrical populations. Biometrika, 20A, 356-60.






SIGNIFICANCE TESTS FOR 2 × 2 TABLES
By G. A. BARNARD, Imperial College
PART I

The theory of statistical significance tests deals with abstractions of experimental results. 
The fact that the figures dealt with may happen to be tensile strengths of iron bars, or 
perhaps weights of babies, is ignored in the carrying out of the test; and for the purpose of 
statistical theory the experiment in question could just as well be represented by an experi- 
ment involving the drawing of balls from urns. In fact, it is an advantage, from some points 
of view, to replace the concrete experiment involved in a particular practical case by an 
‘abstract’ urn-experiment, in order to retain in view only those features of the case which
can be dealt with by statistical methods. 

It is obvious enough that the first step in the statistical treatment of an experimental 
result may be represented as the replacement of the concrete experiment by an
‘urn-experiment’; but the implications of this have not always had the continuous attention they
deserve. Once the abstract picture has been formed, the analysis of it is largely a matter 
of pure mathematics. What distinguishes the statistician from the pure mathematician, in 
this connexion, should be the statistician’s ability to form valid abstract pictures of concrete 
cases, and his clear recognition of the limits of validity of his abstract pictures. Yet we find 
relatively little discussion in statistical text-books of the process of formation of these 
abstract pictures. 

It is the purpose of the first part of this paper to draw attention to the confusion which 
may arise through the possible formation of several different abstract pictures, each of 
which may apply to some concrete cases, though not to others. 

Suppose we are given two mass-production processes, A and B, and we wish to test 
whether process A and process B are equally satisfactory, in the sense that neither process 
is more likely to produce defective items than the other. For this purpose we take, say, 
m articles made by process A, and n made by process B, and test them, under suitable
conditions. We find that a out of the m articles are defective, while b out of the n articles are
defective, a result which can be represented in the form of a 2 × 2 table (Table 1).


Table 1

              I (defective)   II (non-defective)   Total
Process A          a                  c               m
Process B          b                  d               n
Total              r                  s               N


The statistical analysis of results of this type has been much discussed, but it seems to 
have escaped notice that, on the facts incompletely stated as above, it is possible to form 
several different abstract pictures, any one of which might be appropriate to the real case 
in question. The adoption of one picture rather than another will depend, in a given case, 
on further knowledge which is not specified above. 





The basis of Fisher’s ‘exact’ test

The current generally accepted test for results of the above type is that given by Fisher
(1941), or some approximation to it. The simplest abstract picture* to which this test
corresponds would seem to be one in which the m articles made by process A and the n articles
made by process B are represented by N similar balls, m of them marked A and n marked B.
The N balls are put into an urn, and then withdrawn in random order. As they are withdrawn,
the balls are placed, in order, in a row of N receptacles, r of which have been marked ‘I’,
the remainder being marked ‘II’. The result of Table 1 then represents the observation that
a of the balls marked A are in receptacles marked ‘I’. The probability of such a result, in
such an experiment, is

$$\frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!} \tag{1}$$

which can be seen by considering that the contents of the r receptacles marked ‘I’ form a
sample of r from an urn containing m balls marked A and n balls marked B, the sampling
being done without replacement. The probability (1), added to those of all results less
probable than that obtained, is the basis of Fisher’s test.
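The computation described in this paragraph can be sketched as follows. The function names and the example table are hypothetical, chosen here for illustration; the sketch also assumes that results exactly as probable as the observed one are counted in the sum, a convention the text does not spell out. Expression (1) is evaluated in the equivalent form C(m, a) C(n, r − a) / C(N, r).

```python
from math import comb

def table_probability(a, m, n, r):
    """Probability (1) of finding `a` balls marked A in the r receptacles marked 'I'."""
    return comb(m, a) * comb(n, r - a) / comb(m + n, r)

def fisher_p_value(a, m, n, r):
    """Sum of (1) over all tables with the same margins that are no more
    probable than the observed one (ties included: an assumption made here)."""
    lowest, highest = max(0, r - n), min(m, r)
    p_obs = table_probability(a, m, n, r)
    total = 0.0
    for k in range(lowest, highest + 1):
        p = table_probability(k, m, n, r)
        # small relative tolerance so that exact ties survive rounding
        if p <= p_obs * (1.0 + 1e-9):
            total += p
    return total

if __name__ == "__main__":
    # Hypothetical example: m = 3, n = 3, all three defectives from process A.
    print(table_probability(3, 3, 3, 3))
    print(fisher_p_value(3, 3, 3, 3))
```

For a given set of margins m, n, r, s the probabilities (1) of all possible tables sum to unity.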

In the concrete case given, the N balls, initially similar, may be taken to correspond with 
the N items of raw materials. The process of labelling the balls A and B corresponds to the 
selection of m of the items of raw material, and their fabrication into articles by process A , 
and the fabrication of the n remaining ones by process B. The N receptacles into which the 
balls are eventually placed then represent the N ‘test occasions’ which must be provided
for when the experiment is laid out. The fact that these receptacles are labelled ‘I’ or ‘II’
before the balls are placed in them corresponds to the assumption of the hypothesis being
tested: that the processes do not differ in respect of liability to defectives, so that whether
or not a given article is defective has nothing to do with whether it is A or B. The labelling
‘I’ or ‘II’ is thus assumed independent of the labelling of the balls. Finally, the random
allocation of balls to receptacles corresponds to a precaution which might have been taken in
the concrete case, viz. the random order of test of the article secured by the use of random
numbers or the like. 

The basis of the C.S.M. test 

Another abstract picture, also applicable to the concrete case as incompletely described
above, forms the basis of the test to be developed in the later part of this paper, which we
have called the C.S.M. test. In this picture, the two processes, A and B, are represented by two
urns, A and B, each urn containing a large number of balls, some of which are marked ‘I’,
while the others are marked ‘II’. The selection for test of m articles of process A is represented
by the random drawing of m balls from urn A; and similarly for the n articles of process B.
The test procedure corresponds to the examination of the balls, to see whether they are
marked ‘I’ or ‘II’. The liability of process A to produce defectives is represented by the
proportion $p_a$ of balls marked ‘I’ in urn A, while $p_b$ similarly represents the liability of
process B. The hypothesis we wish to test says that $p_a = p_b = p$, say. The probability of a
result such as that of Table 1 is very nearly

$$\frac{m!}{a!\,c!}\,p_a^{a}(1-p_a)^{c}\cdot\frac{n!}{b!\,d!}\,p_b^{b}(1-p_b)^{d} \tag{2}$$


* Though not the only possible one. By following Fisher’s argument, as given in his book, one can
construct a more complicated picture which leads to a similar result. 



which, on the hypothesis tested, becomes

$$\frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}. \tag{3}$$

We may notice that the expression (3) differs from (1) by a factor

$$\frac{N!}{r!\,s!}\,p^{r}(1-p)^{s},$$

and it would have been obtained in the earlier case if we had assumed that the labelling of
the receptacles was itself done randomly, by selection of N labels from a box containing a
large number of labels, the proportion marked ‘I’ being p.

To justify the application of our second picture to a concrete case, we should have to be 
satisfied that the conditions of process A and those of process B were sufficiently stable, in 
a statistical sense, to justify the formation of the notions corresponding to p a and p b . We 
should further have to make sure that our selection of samples of m and n respectively was 
for practical purposes random. And finally, we should have to be reasonably sure that the 
conditions of test themselves had practically no influence on the results of the test — that the 
test used revealed a real property of the article tested, rather than a property of the in- 
dividual conditions of test. 


Another type of abstract experiment 

Another case of common occurrence may be represented by a single urn, containing balls
each of which carries two marks: one mark being either A or B, the other mark being
either ‘I’ or ‘II’. The experiment consists in drawing N balls from the urn, at random, and
examining their markings. If the proportion of balls marked ‘A I’ is $p_{a1}$, while $p_{b1}$, $p_{a2}$, $p_{b2}$
similarly represent the proportions of the other markings in the urn, the probability
associated with Table 1 in this case is

$$\frac{N!}{a!\,b!\,c!\,d!}\,p_{a1}^{a}\,p_{b1}^{b}\,p_{a2}^{c}\,p_{b2}^{d} \tag{4}$$

by the multinomial theorem, provided the number of balls in the urn is large. In this case
the hypothesis tested, that the markings ‘I’ and ‘II’ on the one hand, and the markings A
and B on the other, are independent, may be put in the form

$$p_{a1}\,p_{b2} = p_{a2}\,p_{b1},$$

and, assuming that $(p_{a1} + p_{a2}) = p'$ and $(p_{b1} + p_{b2}) = 1 - p'$, and $(p_{a1} + p_{b1}) = p$ and
$(p_{a2} + p_{b2}) = 1 - p$, do not vanish, the probability of our result, on the hypothesis tested,
can be expressed as

$$\frac{N!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s}\,(p')^{m}(1-p')^{n}, \tag{5}$$

which differs from (3) by a factor

$$\frac{N!}{m!\,n!}\,(p')^{m}(1-p')^{n}.$$

This shows that (5) is related to (3) in much the same way as (3) is related to (1). 
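
The step from the multinomial expression (4) to (5) can be written out as follows. The sketch assumes only the independence condition and the marginal proportions p and p' defined above, with r = a + b, s = c + d, m = a + c and n = b + d; it is offered as a reading aid rather than as part of the original argument.

```latex
% Independence (p_{a1} p_{b2} = p_{a2} p_{b1}) and the fact that the four
% proportions sum to 1 give each cell proportion as a product of margins:
\begin{align*}
p\,p' &= (p_{a1}+p_{b1})(p_{a1}+p_{a2})
       = p_{a1}(p_{a1}+p_{a2}+p_{b1}) + p_{a2}\,p_{b1} \\
      &= p_{a1}(1-p_{b2}) + p_{a1}\,p_{b2} = p_{a1},
\end{align*}
% and in the same way
\begin{align*}
p_{b1} = p\,(1-p'), \qquad p_{a2} = (1-p)\,p', \qquad p_{b2} = (1-p)(1-p').
\end{align*}
% Substituting these products into (4):
\begin{align*}
p_{a1}^{a}\,p_{b1}^{b}\,p_{a2}^{c}\,p_{b2}^{d}
  = p^{a+b}(1-p)^{c+d}\,(p')^{a+c}(1-p')^{b+d}
  = p^{r}(1-p)^{s}\,(p')^{m}(1-p')^{n}.
\end{align*}
```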

This situation could present itself in our concrete case if the articles made by the two 
processes A and B were mixed up together in a common store, and the test sample of N 
were randomly drawn from this store, the subsequent conditions being as in the second case. 
Statisticians with industrial experience may perhaps feel it is unlikely that the experiment
would be performed in this way; but it must be admitted that it could have been. Cases
such as this seem to occur more frequently in biometric investigations, where a population 
of animals is being tested for the association or otherwise of two characters. 

Nomenclature 

The name ‘double dichotomy’ has been applied generally to all experiments leading to
results of the form of Table 1, but the foregoing analysis would suggest that it might be more
appropriate to restrict this term to the third case we have indicated. Since the second case 
can be obtained from the third by supposing the numbers of articles made by process A 
and by process B to be fixed, we might then call the second case the (singly) restricted double 
dichotomy. Similarly, the first case would be called the doubly restricted double dichotomy. 
Such a nomenclature, apart from a lack of euphony, would be open to the objection that it 
would tend to imply that the third case was the general one, the first two being derivatives 
of it. This, in turn, would imply that the subject-matter of our investigation in cases one and 
two was in reality a four-fold universe, the restrictions on numbers being merely matters of 
experimental technique. But such is not always the case. The question implied in our second 
case presupposes two two-fold populations, which are to be compared, and no four-fold 
super-population need exist for this question to have meaning. 

We therefore propose the names ‘double dichotomy’ for the third case, ‘2x2 comparative
trial’ for the second case, and ‘2x2 independence trial’ for the first case, though here again
an objection on aesthetic grounds would be easy to sustain.

Finer distinctions 

In principle it could be maintained that there is a distinction between the 2x2 compara- 
tive trial, as instanced above, and a restricted double dichotomy. As we have said, the funda- 
mental subject-matter of a 2 x 2 comparative trial is a pair of populations; while the subject- 
matter of a restricted double dichotomy is a four-fold population from which we happen, by 
an accident of experimental technique, to be able to extract samples in which the numbers
of items having certain characteristics are fixed. The latter case could arise, for example, 
if an attempt was being made to discover association between colour of eyes in school- 
children and some less easily identified characteristic, such as membership of a particular 
blood-group. We could imagine that an experimenter might pick out m children with (say) 
blue eyes, and n without blue eyes, and then, having obtained his samples, he might subject 
them to a test for blood-group. The conclusions drawn from such an experiment would 
presumably be intended to apply to the population of school-children, a four-fold one relative 
to the two characteristics in question. The distinction between the two cases comes out if
we consider what happens if, in the 2x2 comparative trial, all items tested turn out to be
defective. In this case we should say that our question, whether $p_a = p_b$ or not, tends to be
answered in the affirmative. In the case of the school-children, if they all turn out to have
the same blood-group, then no conclusion on our question about the four-fold population 
can be drawn at all. 

Similar distinctions apply to the 2x2 independence trial. In the psycho-physical experiment
described by Fisher (1942), where the point at issue is whether or not a lady can tell
whether the milk or the tea has been put in the cup first, no statistical population is
presupposed. The question would have meaning even if we refused to regard the order of
insertion of milk or tea as ever being a matter of chance, while at the same time we regarded
the lady’s guess as equally determinate. The ‘statistical population’ enters into this experiment
only in the experimental technique, via the randomization procedure used to fix the
order of presentation of cups; it does not enter into the question being asked. In this case,
the extreme result, in which in fact the milk was put in first every time, while the lady
guessed every time that it was otherwise, would be taken as evidence against the lady’s
claim. But such a result could by itself have no meaning for the question asked in the case
of a restricted 2x2 trial or a doubly restricted double dichotomy.

Further types of experimental procedure leading to results expressible in the form of 
Table 1 are the various sequential procedures that have been described for deciding questions 
of the kind we have been discussing (3, 4). Yet another procedure is one where the conditions 
of trial vary from one block of tests to another — as when an open-air trial runs over several 
days of inconstant weather. Here we might suppose there were k pairs of urns, $(A_1, B_1)$,
$(A_2, B_2)$, ..., $(A_k, B_k)$. The distinctions here are, however, obvious enough, and they are
worth noting only in order to emphasize that the mere fact that results are presented in the
form of Table 1 is not in itself sufficient to specify an appropriate test of significance. 

Part II 

The significance test for the 2x2 trial

Roughly speaking, the object of a significance test as applied to results of the type con- 
sidered, is to answer the question: Can these results be ascribed to ‘chance’? In this form, 
the question is not sufficiently precise. If our ‘urn model’ for the 2x2 comparative trial is 
adequate to represent the experiment actually carried out, then the results will in any case 
be ‘due to chance’, in some sense. What we wish to know in this case is whether a particular 
kind of chance, namely one in which $p_a = p_b = p$, can be said to account for our results.
If the results are such that this explanation of them is untenable, then we may conclude
either that our particular ‘urn model’ of the experiment is inadequate anyway; or we may
retain the model, and conclude that $p_a$ and $p_b$ must be unequal. In most cases, of course,
we shall reach the latter conclusion, since we would not have made up the urn model in
question unless we had some reasons for believing in its adequacy; but it is well to bear in
mind the first alternative, in case a re-examination of the circumstances may make us change
our minds. A point very strongly emphasized by Fisher in his book The Design of Experiments
is that we ought to have in mind a particular ‘urn model’ before the experiment is
performed, and arrange the conduct of the experiment so that the adequacy of this urn 
model is not likely to be questioned afterwards. 

With the qualifications indicated, we can say that the object of the significance test we 
propose to develop is, to enable a particular class of explanations of our experimental results 
to be ruled out as untenable. Specifically, given results like those of Table 1, we want to be
able to say that they could not be accounted for by supposing that the experiment we
actually performed was analogous to the urn experiment with two urns in which $p_a = p_b = p$.
This raises the question, in what sense could such a supposition fail to account for the 
observed results ? Any result of the form of Table 1 could arise in an experiment of this kind, 
when our supposition is true. Why, then, should we select some results of this form and say 
they are incompatible with our supposition ? 

In the last analysis, this question cannot be answered without an examination of what is 
meant in general by statements involving probabilities, a point which is still the subject of
controversy. But in our particular case (if not in all cases) we can avoid giving a general
answer to the question of what probability is, by considering the practical circumstances
which form the setting for our particular problem, and the uses to which we propose to put
the answer. In fact, in our case we are interested in the equality or otherwise of $p_a$ and $p_b$
because we want to decide which of the two processes, A and B, is to be preferred, from the
point of view of defectives produced. To say that $p_a$ is greater than $p_b$ will mean, for us, that
process B is preferable, and conversely if $p_b$ is greater than $p_a$, while to say that $p_a$ and $p_b$
are equal will mean that there is nothing to choose between the two processes. In fact, to
say that $p_a = p_b$, in our case, means that, if process A and process B are both used, then it
will be found that the frequencies with which defectives appear in the two processes will,
for practical purposes, be equal.* Thus we shall assert that results in which the observed
frequencies, a/m and b/n, differ widely, are incompatible with the supposition that $p_a = p_b$;
in doing so, we shall be neglecting as impossible a class of events which are in reality logically
possible, but whose probability is small. The precise formulation of a test of significance
then reduces to a precise formulation of what is meant by a ‘wide difference’ in the
frequencies a/m and b/n, and to an evaluation of the probability of those events which are being
neglected as impossible.

The lattice diagram 

If we consider the first problem, of arranging results like those of Table 1 in order of the 
relative ‘width’ of the differences they indicate, a first step is the enumeration of all possible
results in a convenient form. 

Logically, we should begin by noting that Table 1 is really an abbreviated version of the 
results of any one particular experiment, which will to start with be like those of Table 2 
(where we have taken m = 8, n = 6, for definiteness). 

Table 2 

Urn:   A   A   A   A   A   A   A   A   B   B   B   B   B   B

Mark:  II  I   II  II  I   II  II  II  I   I   II  I   I   I

But if, as we are presupposing, our urn analogy is adequate to represent the conditions of 
the experiment, the order in which the results were obtained must be irrelevant to the 
interpretation of results. If the conditions of trial varied during the course of the experiment, 
this assumption might not be correct — for example, if the trial were an open-air trial, and 
it began to rain half-way through. We are assuming that the urn analogy is adequate, and 
so we must treat all results like Table 2 which give the same values to a, b, c, d, in Table 1,
as equivalent. Table 1 therefore stands for $m!\,n!/(a!\,b!\,c!\,d!)$ distinct, but equivalent, results
which we shall not distinguish from now on. 
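
The count of equivalent results can be checked by brute force for the sample sizes m = 8, n = 6 of Table 2. The sketch below assumes the cell values a = 2, b = 5, c = 6, d = 1, which are the values the text later attributes to Table 2; the function names are hypothetical.

```python
from itertools import product
from math import comb


def count_equivalent_results(a, b, c, d):
    """m! n! / (a! b! c! d!), written as C(m, a) * C(n, b)."""
    return comb(a + c, a) * comb(b + d, b)


def count_by_enumeration(m, n, a, b):
    """Count ordered mark sequences with a 'I's among the m balls from urn A
    and b 'I's among the n balls from urn B."""
    count = 0
    for marks in product(("I", "II"), repeat=m + n):
        if marks[:m].count("I") == a and marks[m:].count("I") == b:
            count += 1
    return count


if __name__ == "__main__":
    print(count_equivalent_results(2, 5, 6, 1))
    print(count_by_enumeration(8, 6, 2, 5))
```

Both functions return the same number, so each lattice point stands for that many distinct ordered results of the kind shown in Table 2.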

If we now take rectangular axes in a plane, we can represent Table 1 by the point whose 
coordinates are (a, b). Thus ‘x’ in Fig. 1 represents the set of results equivalent, in the sense
of the previous paragraph, to the results of Table 2. At the same time, all possible results 
of the experiment which gave rise to Table 2 are represented by the points of the rectangle 

* We hope that the qualifications we have attached to our statements will be sufficient to guard us 
against the accusation that we have adopted in full a ‘ frequency theory ’ of probability. The frequency 
interpretation is relevant to our particular problem ; other problems may involve other interpretations. 
More than one interpretation may be relevant in a single problem. 





PQRS. We call this representation of possible results the lattice diagram.* Our problem
may now be regarded as one of ordering the points of the lattice diagram according to the 
‘width’ of the difference they indicate. 

Conditions S and C

In trying to make the idea of ‘width’ of difference precise, we are up against difficulties 
similar to those attaching to the interpretation of results on the basis of incomplete infor- 
mation about the circumstances of the experiment. The information given at first was 
compatible with several distinct ‘urn models’. Similarly, the information given now is 
compatible with several different notions about ‘width’ of difference. We may be concerned
with the arithmetical size of the difference $p_a - p_b$, or with the ratio $p_a/p_b$, or with the
logarithm of this ratio, or with some more complicated function.

Logically, therefore, we should expect to set up various tests, based on various ideas of 
what constitutes ‘ width ’ of difference in probability or frequency (in Neyman and Pearson’s 


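[Fig. 1. The lattice diagram: rectangle PQRS, with a = 0, 1, ..., 8 on the horizontal axis and b = 0, 1, ..., 6 on the vertical axis; P = (0, 0), Q = (8, 0), R = (8, 6), S = (0, 6).]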


Table 3

          I     II    Total
A         c     a     m
B         d     b     n
Total     s     r     N


language, corresponding to various weight functions over the space of alternatives to the 
hypothesis tested). But here a factor which may simply be described as laziness enters in. 
If we carried our ideas to their logical conclusion, we should find ourselves constructing a 
new test for almost every new experiment we had to deal with; and the time and effort 
involved in this are too great. Consequently, we confine our attempt to producing a test 
which will be reasonably applicable to a wide class of cases of the type specified, without 
suggesting that this test is unique, or ‘best possible’.

First, then, in our ordering of points in the lattice diagram, we propose that the same rank
should be given to the point (m - a, n - b) as to the point (a, b). This condition we propose
to call the ‘symmetry condition’, or ‘condition S’. It amounts to saying that if Table 1
is to be considered as indicating a real difference between $p_a$ and $p_b$, then so is Table 3, in
which the labels ‘I’ and ‘II’ have been interchanged. If, when we are testing whether

* Not the sample space of Neyman and Pearson. In the sample space, different results equivalent
to Table 2 are represented by different points. 


$p_a = p_b$, we can say we are also testing whether $1 - p_a = 1 - p_b$, from the same point of view,
then this symmetry condition is clearly justified.*

Next, we propose that in our ordering, the two points which, respectively, have the same
abscissa or the same ordinate as (a, b), and which lie further from the diagonal PR, shall be
considered as indicating wider differences than (a, b) itself. Thus, referring to Fig. 1, the
points immediately above and immediately to the left of the point ‘x’ are reckoned to
indicate wider differences than the point ‘x’ itself. This condition implies that the set of
points indicating differences as wide or wider than (a, b) will have a shape property vaguely
related to convexity, and we call it the ‘C condition’. It means that if we consider the
table corresponding to Table 2, with cell frequencies

2 6 
5 1 

as significant evidence of difference, then we must also consider the tables 

1 7         2 6
5 1   and   6 0

as significant evidence of difference. It is difficult to imagine circumstances where this 
would not be so. 

Geometrically, condition S implies that we can in future restrict our considerations to
points in the lattice diagram lying on or above the diagonal PR, i.e. in the triangle PRS.
And condition C implies that, in this triangle, our ‘width of difference’ must increase as
we go upwards or to the left. If horizontal and vertical axes are taken at any point X in this
triangle, points in the second quadrant are associated with a wider difference than X is,
points in the fourth quadrant are associated with narrower differences than X is. The relative
width of differences associated with points in the first and third quadrants (excluding the
axes) is not determined by the conditions C and S. The ordering generated by these
conditions is thus a partial, not a total, ordering; it is, in fact, a kind of conical order, in the
sense of A. A. Robb. We must introduce some further condition to make the ordering total.
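
The partial ordering generated by the two conditions can be sketched in Python as follows. The function names and the four return labels are illustrative assumptions; the comparison itself follows the rule stated above, that within the triangle PRS a point lying above or to the left of another indicates the wider difference, after each point has been moved into that triangle by the symmetry condition.

```python
def reflect_into_triangle(a, b, m, n):
    """Symmetry condition: (a, b) and (m - a, n - b) share a rank, so return
    the representative lying on or above the diagonal PR (a/m <= b/n)."""
    below_diagonal = a * n > b * m
    on_diagonal_upper_half = a * n == b * m and a > m - a
    if below_diagonal or on_diagonal_upper_half:
        return m - a, n - b
    return a, b


def compare_width(x, y, m, n):
    """Describe the difference indicated by y relative to x as 'wider',
    'narrower', 'equal' or 'undetermined' under the two conditions."""
    ax, bx = reflect_into_triangle(x[0], x[1], m, n)
    ay, by = reflect_into_triangle(y[0], y[1], m, n)
    if (ax, bx) == (ay, by):
        return "equal"
    if ay <= ax and by >= bx:   # y lies above and/or to the left of x
        return "wider"
    if ay >= ax and by <= bx:   # y lies below and/or to the right of x
        return "narrower"
    return "undetermined"       # first or third quadrant: no verdict


if __name__ == "__main__":
    m, n = 8, 6
    for y in [(1, 5), (2, 6), (3, 4), (3, 6), (6, 1)]:
        print(y, compare_width((2, 5), y, m, n))
```

Pairs that come back as undetermined are exactly those for which some further condition is needed before the ordering becomes total.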


Probability considerations

In many simpler cases, it is possible to distinguish those events which are considered 
incompatible with a given probability hypothesis by their relatively low probability, 
compared with other possible events. Such a simple comparison of probabilities is not open 
to us in this case, because to each point (a, b) we have, on the hypothesis tested, associated
a function

$$W(a, b; p) = \frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s},$$

which contains the ‘nuisance parameter’ p. If we consider the relative position, in our
ordering, of another point, (a', b'), we have to consider the inequality

$$W(a, b; p) < W(a', b'; p), \tag{6}$$

the truth or falsehood of which depends, in general, on the unknown p; and there is nothing
in the statement of the problem, nor in the experimental method, to justify any particular
choice for the value of p.
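
A short Python sketch makes the dependence on p concrete. It assumes the sample sizes m = 8, n = 6 of the running example and compares two particular points chosen for illustration; the function name `W` mirrors the notation above, and the two values of p are arbitrary.

```python
from math import comb


def W(a, b, m, n, p):
    """Probability of the lattice point (a, b) when p_a = p_b = p."""
    r = a + b
    s = m + n - r
    return comb(m, a) * comb(n, b) * p**r * (1 - p)**s


def inequality_6(point, other, m, n, p):
    """True when W(point; p) < W(other; p), i.e. inequality (6) holds."""
    return W(point[0], point[1], m, n, p) < W(other[0], other[1], m, n, p)


if __name__ == "__main__":
    m, n = 8, 6
    for p in (0.3, 0.6):
        print(p, inequality_6((0, 5), (1, 6), m, n, p))
```

For these two points the inequality is false at the smaller value of p and true at the larger one, so the comparison cannot be settled without fixing p.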


* Cases where is impossible are hereby neglected, strictly. 





If $(a + b) = (a' + b')$, the validity or otherwise of the inequality (6) is independent of p.
Thus, using this inequality as a criterion for ordering our points, we can say that in the
triangle PRS, the ‘width of difference’ must increase as we move north-west. But this is
all that can be derived from this criterion, and it is clearly even less helpful in ordering the
points than the conditions C and S are. Moreover, if we recall that each point (a, b) in the
lattice diagram really represents a set of $m!\,n!/(a!\,b!\,c!\,d!)$ distinct results, each with probability
$p^{r}(1-p)^{s}$, the criterion (6) loses its plausibility.

We might try to improve the situation by associating the function $W(a, b; p)$ with a
number, depending on a and b only. For fixed a and b, this number would be a functional
of $W(a, b; p)$. We should clearly require that, if the inequality (6) is true for all p, then the
corresponding inequality should be true of the numbers associated with $W(a, b; p)$ and
$W(a', b'; p)$. The simplest functionals which satisfy this condition will be the mean value,

$$w(a, b) = \int_0^1 W(a, b; p)\,dp,$$

the maximum value

$$w'(a, b) = \max_{0 \le p \le 1} W(a, b; p),$$

and one single value

$$w''(a, b) = W(a, b; p_0).$$
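
The three functionals can be evaluated directly, as in the sketch below. It relies on two standard facts that are not stated in the text and are therefore assumptions of the sketch: the Beta integral, which gives the mean of p^r (1 - p)^s over (0, 1) as r! s! / (N + 1)!, and the fact that p^r (1 - p)^s is greatest at p = r/N. The function names are hypothetical.

```python
from math import comb, factorial


def W(a, b, m, n, p):
    """Probability of the lattice point (a, b) when p_a = p_b = p."""
    r = a + b
    s = m + n - r
    return comb(m, a) * comb(n, b) * p**r * (1 - p)**s


def mean_value(a, b, m, n):
    """w(a, b): integral of W over 0 <= p <= 1, via the Beta integral."""
    r = a + b
    s = m + n - r
    return (comb(m, a) * comb(n, b)
            * factorial(r) * factorial(s) / factorial(m + n + 1))


def max_value(a, b, m, n):
    """w'(a, b): W evaluated at its maximising value p = r / N."""
    return W(a, b, m, n, (a + b) / (m + n))


def single_value(a, b, m, n, p0):
    """w''(a, b): W evaluated at one chosen value p0."""
    return W(a, b, m, n, p0)


if __name__ == "__main__":
    point = (2, 5)
    print(mean_value(*point, 8, 6))
    print(max_value(*point, 8, 6))
    print(single_value(*point, 8, 6, 1 / 3))
```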

Circumstances could be imagined in which any of these three criteria might produce 
reasonable tests of significance. For example, in certain genetical experiments we may have 
reason to suppose that the value p = 1/3 would occur more often than any other value. In
such a case we might use $w''$, with $p_0 = 1/3$. But for general purposes taking $p_0 = 1/3$ could
not be justified.

We might again argue that taking w as our criterion would correspond with the assumption 
that all values of p were a priori equally likely. But some would say that such an assumption
was never justified; while those who would admit the assumption would in strictness do so
only if we really did know nothing about the value of p. And in the general circumstances we
are trying to cater for, we may sometimes know something vague about the value of p,
such as, for example, that p will be less than 1/2.

Neyman and Pearson have shown that the likelihood ratio, which in our case comes to be

$$\frac{m^{m}\,n^{n}\,r^{r}\,s^{s}}{a^{a}\,b^{b}\,c^{c}\,d^{d}\,N^{N}},$$

very often gives a good basis for ordering experimental results. We feel, however, that the
criterion we shall describe in the next section has a slightly more direct justification than
the likelihood ratio, though the choice is, admittedly, largely a matter of taste.
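
For comparison, the likelihood ratio can be computed from the four cell counts as in this sketch. It assumes the usual convention 0^0 = 1 for empty cells, which Python's integer power already follows; the function name and the example tables are illustrative.

```python
def likelihood_ratio(a, b, c, d):
    """m^m n^n r^r s^s / (a^a b^b c^c d^d N^N) for a 2x2 table with rows
    (a, c) and (b, d); values near 1 indicate little difference."""
    m, n = a + c, b + d
    r, s = a + b, c + d
    total = m + n
    numerator = m**m * n**n * r**r * s**s
    denominator = a**a * b**b * c**c * d**d * total**total
    return numerator / denominator


if __name__ == "__main__":
    print(likelihood_ratio(4, 3, 4, 3))  # equal observed proportions
    print(likelihood_ratio(2, 5, 6, 1))  # the cell values of the running example
    print(likelihood_ratio(0, 6, 8, 0))  # the corner point S
```

Smaller values correspond to wider differences, so ordering the lattice points by increasing likelihood ratio gives one candidate total ordering.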


The maximum condition

Before setting out the final condition which, with conditions S and C , will be used even- 
tually to arrange the points of the lattice diagram in order of ‘relative width of difference 
indicated’, we need to consider the assignment of significance levels to various results. 

When we say that a given result is not significant on, say, the 5 % level, we mean that
such a result, or one indicating a wider difference, could occur, with probability at least 0.05,
even when $p_a = p_b$. We could believe in a theory that $p_a = p_b$, without having to suppose
that an event belonging to a class whose joint probability was less than 0.05 had occurred.
Conversely, if a result is judged significant on the 5 % level, it means that no theory which
assumed that $p_a = p_b$ could account for the result obtained without supposing that an event
of a type whose probability was less than 0.05 had occurred on the occasion in question.

Let us now consider a specific case, in which we choose numbers which in practice would
be ridiculously small in order to save arithmetic. Suppose, in fact, m = n = 2, while a = 2
and b = 0. It follows from conditions S and C alone that in judging the significance of such
a result we need consider only the probability of this result, together with its converse, in
which a = 0, b = 2. If $p_a = p_b = p$, the probability of results of this type is

$$P = 2p^{2}(1-p)^{2}.$$

Now suppose that we are prepared to discard as untenable theories which require us to suppose
that events of probability less than 0.05 had occurred. In such a case, we should discard
a theory which supposed $p_a = p_b = 0.1$, since in this case $P = 2(0.1)^{2}(0.9)^{2} = 0.0162$, less
than 0.05. But we could not discard a theory which supposed $p_a = p_b = 0.5$, since in this
case P = 0.125. In fact, our result would enable us to discard all theories involving $p_a = p_b = p$,
except those for which p lay in the interval 0.197 < p < 0.803. In particular practical cases
we might be prepared, on grounds external to the experiment in question, to dismiss the
possibility that p should lie in this interval; and in such cases we should be entitled to say
that the result excludes the possibility that $p_a = p_b$.
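
The arithmetic of this small case can be reproduced with a few lines of Python. The sketch solves 2p^2(1 - p)^2 = level for p by writing q = p(1 - p), which is an elementary rearrangement rather than anything taken from the text; the function names are hypothetical.

```python
from math import sqrt


def P(p):
    """Probability of the result a = 2, b = 0 or its converse, m = n = 2."""
    return 2 * p**2 * (1 - p)**2


def tenable_interval(level):
    """Values of p for which P(p) >= level, or None if there are none.

    With q = p(1 - p), P = 2 q^2, so the boundary is q = sqrt(level / 2)
    and p = 1/2 +/- sqrt(1/4 - q)."""
    q = sqrt(level / 2)
    if q > 0.25:
        return None
    half_width = sqrt(0.25 - q)
    return 0.5 - half_width, 0.5 + half_width


if __name__ == "__main__":
    print(P(0.1), P(0.5))
    print(tenable_interval(0.05))
```

At the 0.05 level the interval returned agrees with the limits 0.197 and 0.803 quoted above; for any level above 0.125 the function returns None, since P never exceeds that value.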

It is easy to see that the above specific case is typical. Any set of points in the lattice
diagram, considered by some criterion agreeing with conditions S and C to indicate differences
as wide or wider than those of a given result, will be associated with a probability P,
on the assumption $p_a = p_b = p$; and this P will be a function of p, rising from zero when
p = 0 to a maximum in the neighbourhood of p = 1/2, and then falling again symmetrically
(by the S condition) to zero again at p = 1, somewhat as in Fig. 2. The given result by itself
will exclude the possibility $p_a = p_b$ altogether, only if the significance level adopted is greater
than $P_m$, the maximum value of P. If our significance level corresponds to a probability
less than $P_m$, then all we can say is, that our result is incompatible with $p_a = p_b$ unless their
common value lies in a certain subset of the range (0, 1). We may or may not exclude these
latter possibilities on other grounds.

In trying to construct our test, however, we have set ourselves the task of evaluating the 
evidence provided by our experiment alone in relation to the hypothesis $p_a = p_b$. It now appears
that this is impossible so long as we restrict ourselves to the form, usual in such cases, of a 
simple statement that a given result is, or is not, significant on a given level. We have two 
alternatives. Either we can find an entirely new form of statement to convey what we wish 
to express; or we can adhere to the form of statement, and try to make the situation fit the 
form as nearly as possible. Perhaps the day will come when experimenters do not require 
answers in the form of numbers, when they are sufficiently versed in generalized mathe- 
matical analysis to be content with a function (such as the function P(p)), instead of a single 





number. But we have not yet reached this stage; and so we propose to take up the latter 
alternative, and try to make the situation fit the standard form of statement of significance 
tests as nearly as possible.* 

Our difficulty arises from the dependence of P on p . If the graph of P against p were a 
horizontal straight line, our difficulty would be overcome. What we propose, therefore, is 
to try to make the graph of P against p as near to a horizontal line as possible, by suitably 
adapting our idea of what is meant by ‘width of difference’. In making this adaptation, we 
shall secure that we do not violate the common-sense requirements as to the meaning of the 
term ‘width of difference’, by requiring that conditions C and S should always be satisfied.

The maximum condition 

The condition C requires that, of all points in the triangle PRS, that indicating the
‘widest difference’ must be the point S at the corner (Fig. 1). The function P associated with
this point and its converse, Q, which we may denote as $P(0, 6; p)$, is

$$P(0, 6; p) = p^{6}(1-p)^{8} + p^{8}(1-p)^{6},$$

and the maximum $P_m$ occurs here when p = 1/2, where we have

$$P_m(0, 6) = 1/2^{13} = 1.22 \times 10^{-4}.$$

The condition C requires that the only points which might be considered as coming next
after S, in order of decreasing ‘width of difference’, are (1, 6) and (0, 5). We have to adopt
some principle to choose between these two.

If (1, 6) were taken next after (0, 6), the function P associated with it would be

$$P'(1, 6; p) = P(0, 6; p) + 16p^{7}(1-p)^{7},$$

and $P'_m(1, 6)$ would come to $9/2^{13} = 10.99 \times 10^{-4}$. On the other hand, if (0, 5) were chosen
next, instead of (1, 6), we should have

$$P''(0, 5; p) = P(0, 6; p) + 6[p^{9}(1-p)^{5} + p^{5}(1-p)^{9}],$$

and $P''_m(0, 5)$ would come to $8.55 \times 10^{-4}$, the maximum occurring when p = 1/2 ± [illegible] √(6/70).
Thus $P''_m(0, 5)$ is smaller than $P'_m(1, 6)$, and this lower maximum is associated with a flatter
curve of $P(0, 5; p)$. Since a flat curve is our aim (the horizontal line being the ideal), we
choose (0, 5) as the point to come next after (0, 6), rather than (1, 6).

Having chosen (0, 5) as the next ‘widest difference’ point, the C condition restricts us to
the points (1, 6) and (0, 4) as candidates for the next position. We consequently compare

$$P'(1, 6; p) = P(0, 5; p) + 16p^{7}(1-p)^{7}$$

with

$$P''(0, 4; p) = P(0, 5; p) + 15[p^{4}(1-p)^{10} + p^{10}(1-p)^{4}],$$

and the lower value of $P_m$ as criterion shows that (1, 6) is now to be taken. At the next stage,
we shall have to compare the functions associated with (0, 4), (1, 5) and (2, 6). In this way
we can arrange the points of the lattice diagram in order, step by step.
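
The first few comparisons can be checked numerically. The sketch below, for m = 8 and n = 6, finds each maximum by scanning a grid of values of p, which is an assumption of the sketch and only approximates maxima that do not fall at p = 1/2; the names `pair_term` and `peak` are hypothetical.

```python
from math import comb

M_ARTICLES, N_ARTICLES = 8, 6  # sample sizes m and n of the worked example


def pair_term(a, b, p, m=M_ARTICLES, n=N_ARTICLES):
    """Probability of the point (a, b) plus that of its converse."""
    r = a + b
    s = m + n - r
    return comb(m, a) * comb(n, b) * (p**r * (1 - p)**s + p**s * (1 - p)**r)


def peak(points, grid=20001):
    """Largest value, over a grid of p in [0, 1], of the total probability
    of the listed points and their converses."""
    best = 0.0
    for k in range(grid):
        p = k / (grid - 1)
        best = max(best, sum(pair_term(a, b, p) for a, b in points))
    return best


if __name__ == "__main__":
    print("S alone:        ", peak([(0, 6)]))
    print("S then (1, 6):  ", peak([(0, 6), (1, 6)]))
    print("S then (0, 5):  ", peak([(0, 6), (0, 5)]))
    print("third, (1, 6):  ", peak([(0, 6), (0, 5), (1, 6)]))
    print("third, (0, 4):  ", peak([(0, 6), (0, 5), (0, 4)]))
```

At each stage the candidate with the smaller peak is the one selected, which reproduces the choices (0, 5) and then (1, 6) made above.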

The principle involved, which we call the ‘maximum condition’, may be formally stated 
as follows: 

Considering only points for which a/m is less than b/n, if the first (n - 1) points $(a_1, b_1)$,
..., $(a_{n-1}, b_{n-1})$, in order of decreasing ‘width of difference’, have been chosen, and
$(a_{n-1}, b_{n-1})$ is associated with the function $P(a_{n-1}, b_{n-1}; p)$, then the nth point, $(a_n, b_n)$, is
that point, of all points (a, b) permitted by the C condition, for which

$$P_m(a, b) = \max_{0 \le p \le 1}\left[P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a!\,b!\,c!\,d!}\left(p^{r}(1-p)^{s} + p^{s}(1-p)^{r}\right)\right]$$

is least. $(a_n, b_n)$ is then associated with the function

$$P(a_n, b_n; p) = P(a_{n-1}, b_{n-1}; p) + \frac{m!\,n!}{a!\,b!\,c!\,d!}\left[p^{r}(1-p)^{s} + p^{s}(1-p)^{r}\right],$$

the second term being evaluated at $(a, b) = (a_n, b_n)$.

* In the example just taken we might make a kind of ‘conditional confidence interval statement’,
that, if p existed, we should have 0.197 < p < 0.803 with confidence coefficient 0.95.

To complete the specification of the ordering, we have to legislate for the case where there 
are several points giving the same value of P m (a, b ), this value being less than that associated 
with any other permissible point. In this case we lay down that all such points are to be 
given the same rank, and the second term in the expression for P(a n , b n ; p) is to be replaced 
by the corresponding sum over all these points. If there are k such points at any stage, then 
the next point after them will be denoted as the (n + k)th point in the ordering. This requires,
for example, when m = n, that the points (a, b) and (b, a) are always to be taken together.

Finally, the significance level to be attached to the point $(a_n, b_n)$ will be

$$P_m(a_n, b_n) = \max_{0 \le p \le 1} P(a_n, b_n; p).$$

This guarantees that our test will be a ‘valid’ one, in the sense that, if we judge a result
incompatible with the hypothesis $p_a = p_b$, on a given level of significance, then all the
possibilities of the form $p_a = p_b$ are excluded, to the given level. Thus no further information,
external to the experiment in question, could make us decide that a result judged significant
by our test was not in fact so (holding, of course, to a fixed significance level); on the other
hand, we still have the possibility that other information may lead us to consider as significant
results which appear in themselves not to be so. The formulation of our maximum
condition is made so as to minimize this latter possibility. Our test is thus conservative, in
the sense that we do not draw the conclusion $p_a \neq p_b$ unless this is certainly warranted by
the data; but it might be called ‘progressive conservative’ because, of all such conservative
tests, it will be the least conservative.
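
The whole procedure can be sketched as a short Python routine. This is a minimal illustration under stated assumptions, not the author's own computing scheme: maxima over p are taken on a finite grid, ties are detected with a small relative tolerance, only points with a/m less than b/n are ranked (each standing also for its converse), and the names `order_lattice`, `pair_term` and `permitted` are hypothetical.

```python
from math import comb


def order_lattice(m, n, grid=2001, tol=1e-9):
    """Rank the points with a/m < b/n by the maximum condition.

    Returns a list of (P_m, points) pairs in order of decreasing 'width of
    difference'; points sharing a rank are listed together.
    """
    ps = [k / (grid - 1) for k in range(grid)]  # grid over 0 <= p <= 1

    def pair_term(a, b):
        # Probability of (a, b) plus that of its converse (m - a, n - b).
        r = a + b
        s = m + n - r
        coef = comb(m, a) * comb(n, b)
        return [coef * (p**r * (1 - p)**s + p**s * (1 - p)**r) for p in ps]

    remaining = {(a, b) for a in range(m + 1) for b in range(n + 1)
                 if a * n < b * m}
    chosen = set()
    current = [0.0] * grid  # P for the points ranked so far, on the grid
    ranking = []

    def permitted(point):
        # C condition: the points directly above and to the left must
        # already be ranked, or lie outside the lattice.
        a, b = point
        above_ok = b == n or (a, b + 1) in chosen
        left_ok = a == 0 or (a - 1, b) in chosen
        return above_ok and left_ok

    while remaining:
        terms = {pt: pair_term(*pt) for pt in remaining if permitted(pt)}
        peaks = {pt: max(c + t for c, t in zip(current, term))
                 for pt, term in terms.items()}
        least = min(peaks.values())
        tied = sorted(pt for pt, value in peaks.items()
                      if value <= least * (1 + tol))
        for pt in tied:  # tied points enter the ordering together
            current = [c + t for c, t in zip(current, terms[pt])]
        chosen.update(tied)
        remaining.difference_update(tied)
        ranking.append((max(current), tied))
    return ranking


if __name__ == "__main__":
    for level, points in order_lattice(8, 6)[:5]:
        print(points, level)
```

The number attached to each rank is the significance level of the corresponding results; a finer grid, or an exact maximisation, would be needed before such values were tabulated for use.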


Another aspect of the maximum condition 


When the author first approached the problem of analysis of experimental results of the
type now considered, he did so from the point of view of regarding the significance level to 
be used as being fixed in advance, say at the 5 % level. From this point of view, the problem 
of constructing a test resolved itself, not into one of ordering the points in the lattice diagram, 
but into one of choosing a region, or set of points in the lattice diagram, such that any point 
belonging to this region could be regarded as evidence of inequality of $p_a$ and $p_b$, on the given
level of significance. The condition of symmetry required that such a region should consist
of two similar parts, one above the diagonal PR, and one below it. The condition C required
that the part of the region lying above the diagonal PR should be so shaped that if a point
X belonged to the region, then so would all points lying north or west of X. There remained
the problem, to decide which of the many regions satisfying these two conditions should be 
the one adopted. 

To settle this, to any such region R we can associate a function

$$P(R; p) = \sum_{(a, b) \in R} \frac{m!\,n!}{a!\,b!\,c!\,d!}\,p^{r}(1-p)^{s},$$

and such a region will give a ‘valid’ test of significance provided that

$$\max_{0 \le p \le 1} P(R; p) \le 0.05.$$

There will not be so many regions satisfying this validity condition as well as the conditions
S and C. We proposed, therefore, to select that region from among these, which had the
greatest number of points in it. This last condition was what we then called the ‘maximum
condition’. The fact that this region would not be unique in cases where m = n was taken
care of by requiring a subsidiary symmetry condition that in such cases (a, b) and (b, a)
should always be taken together.

What we have now adopted as the ‘maximum condition’ can be seen to be related to this
earlier version, by the consideration that, roughly speaking, apart from effects due to the
discreteness of the lattice diagram, holding the number of points in the region constant, and
then choosing the region which gives the lowest value for P_m, as we do now, comes to the same
thing as holding P_m constant, and then choosing the region to have the maximum number
of points.

Other things being equal, the ‘power’ of a test, in the sense of Neyman and Pearson, will 
increase with the ‘volume’ of the rejection region chosen. In this sense we can say, roughly,
that the maximum condition secures that our test should be as powerful as possible, con- 
sistent with validity. 

Practical formulation of the test 

Some statistical tests (such as that due to Fisher, already mentioned), can be carried out 
in the form of a direct calculation from the data, without reference to any special tables. 
Most other tests require the use of special tables which, however, are for the most part tables 
of single or double entry, perhaps triple entry, if the level of significance is regarded as a 
variable. In our case, regarding the level of significance as a variable, a table of quadruple 
entry would be required. 

Ideally, a set of tables, one for each pair of values of m and n (m ≥ n) would be required.
The table would be in the shape of a right-angled triangle, corresponding to the triangle
PRS of Fig. 1, and divided into squares, each square corresponding to given values of a and b.
Within each square (a, b) would then appear a number, the value of P_m(a, b). This value of
P_m(a, b) then is the maximum probability of obtaining the result (a, b), or one indicating a
wider difference, if p_a = p_b. A comparison of P_m(a, b) with the significance level adopted will
then decide the significance or otherwise of our result. In any particular case we shall be able
to see which tables, in the sense of our test, are regarded as indicating a wider difference, by
noting which points are associated with lower values of P_m(a, b).

In practice, it will be impossible to construct such tables for a large range of values of m 
and n. But for larger values of m and n, a test based on a normal approximation to the dis- 
tributions involved will be quite adequate for practical purposes. In fact, the test we have 
proposed will itself approximate, in some sense, to a test based on the normal distribution, 
though we do not enter into a detailed discussion of the relationship between the two tests 
here.* Tables are thus required for our test only for small values of m and n. In spite of 
advice by statisticians to the contrary, such small values of m and n continue to occur 
frequently in practice.

* The general question of the sense in which tests are regarded as ‘asymptotically approaching’
normal tests is a subject for another paper. Professor Pearson’s paper which follows, bears on this point. 




In the Appendix we give specimen tables for the cases where N = 14. The comparative 
figures for the Fisher test, also given in the Appendix, indicate that the differences between 
the two tests are appreciable. An exploration is now under way into larger values of m and n, 
and it is hoped to report on this in due course. 


Other applications of the C.S.M. procedure

We have spoken of our test as the C.S.M. test, as if the case dealt with above were the only
case to which the procedure adopted was applicable. But similar methods could be used
in many other cases. In particular, a method closely following the one we have used might
be applied to the case we have called the double dichotomy, which differs from the 2x2
comparative trial in that two ‘nuisance parameters’, p and p', are present, instead of only
one. The 2-dimensional lattice diagram of the 2x2 trial is replaced by a 3-dimensional
regular tetrahedron of points with homogeneous coordinates (a, b, c, d), connected by the
relation

a + b + c + d = N.


Two opposite edges of this tetrahedron correspond to m = 0 and n = 0, and sections of the 
tetrahedron by planes parallel to these edges will look exactly like lattice diagrams for the 
2x2 case and within these sections, relative probabilities will behave just as in the 2x2 case. 
An examination of the possibilities, however, indicates that not much is to be gained by a 
detailed treatment. The C.S.M. test for 2 x 2 comparative trials will be a valid test if applied 
to double dichotomies. It will err somewhat on the side of ‘conservatism’, but the error 
does not appear to be large, except when the numbers involved are exceedingly small. 

It is with a view to further applications of the approach used in this paper that we have 
retained the V condition as a separate requirement, although it is easy to see that it could 
be absorbed into the M condition as we have given it. 


In writing this paper the author has had great personal help and encouragement from
Prof. E. S. Pearson, to whom he wishes to express his very deep thanks. 


Summary 

In Part I we discuss various types of experiment, each of which may give rise to results 
in the form of a 2 x 2 table. It appears that significance tests which may be appropriate 
for one type of experiment will not necessarily be appropriate for another. 

In Part II a test is developed for experiments of the type called ‘2x2 comparative
trials’.


APPENDIX 
Tables for the CSM test 

Three tables are given below to illustrate the application of the ideas given in the main paper
to the construction of a test for 2x2 comparative trials. The cases covered are pairs of
samples, sizes (7, 7), (8, 6), and (9, 5). The small figures in brackets in the (7, 7) table give
significance levels on Fisher’s ‘exact’ test for 2x2 independence trials, for comparison.
Only half of the (8, 6) and (9, 5) tables are given; the missing parts can be filled in by
symmetry. The following examples show the meaning and use of the tables:





Example 1. Two boxes, each containing a large number of components, are to be tested 
for comparative quality measured by the respective proportions of defective components 
they contain. Two samples, each of seven components, are taken, at random, one from each 
box. One sample gives four defectives, the other, none. What is the significance of this result, 
in relation to the hypothesis that the boxes have the same quality ? 

Answer. Entering the (7, 7) table at the point (0, 4), we find the number 2.4. This means
that the result is evidence against the hypothesis, on the 2.4 % level of significance.





Table for m = n = 7

 7    0.012    0.18     0.70     2.4      7.5      20       —        —
     (0.058)  (0.23)   (2.1)    (7.0)    (19)     (46)
 6    0.18     1.3      5.7      13       —        —        —        —
     (0.23)   (2.9)    (10)     (27)
 5    0.70     5.7      21       —        —        —        —        20
     (2.1)    (10)     (29)                                         (46)
 4    2.4      13       —        —        —        —        —        7.5
     (7.0)    (27)                                                  (19)
 3    7.5      —        —        —        —        —        13       2.4
     (19)                                                  (27)     (7.0)
 2    20       —        —        —        —        21       5.7      0.70
     (46)                                         (29)     (10)     (2.1)
 1    —        —        —        —        13       5.7      1.3      0.18
                                         (27)     (10)     (2.9)    (0.23)
 0    —        —        20       7.5      2.4      0.70     0.18     0.012
                       (46)     (19)     (7.0)    (2.1)    (0.23)   (0.058)

      0        1        2        3        4        5        6        7


More precisely, what is asserted is, that the maximum probability of getting a result not
less significant than that obtained, is 0.024. And the results which are not less significant
are those which correspond to points in the table with numbers not greater than 2.4, viz.
(0, 4), (7, 3), (0, 5), (7, 2), (0, 6), (7, 1), (0, 7), (7, 0), (1, 6), (6, 1), (1, 7), (6, 0), (2, 7), (5, 0),
(3, 7), (4, 0). By suitable choice of the proportion defective, we could construct a pair of
boxes, of equal quality, which would give samples falling in this group 24 times out of 1000,
on the average; but we could not, by any choice of proportion defective, retain equal quality
and yet have results in this group more often than 24 times in 1000.
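The assertion can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes that each sample is binomial with a common proportion p, sums the chances of the sixteen listed points, and takes the largest value over an evenly spaced grid of p. The function names and the grid size are illustrative choices.

```python
from math import comb


def region_probability(region, m, n, p):
    """Chance that (a, b) falls in `region` when both samples share the
    proportion p; a is counted out of m and b out of n."""
    q = 1.0 - p
    return sum(
        comb(m, a) * comb(n, b) * p ** (a + b) * q ** (m + n - a - b)
        for a, b in region
    )


def max_region_probability(region, m, n, steps=2000):
    """Approximate maximum of the region's chance over 0 < p < 1,
    evaluated on an even grid (an assumption of this sketch)."""
    return max(
        region_probability(region, m, n, k / steps) for k in range(1, steps)
    )


# The sixteen points of the (7, 7) table with entries not greater than 2.4.
REGION_EXAMPLE_1 = [
    (0, 4), (7, 3), (0, 5), (7, 2), (0, 6), (7, 1), (0, 7), (7, 0),
    (1, 6), (6, 1), (1, 7), (6, 0), (2, 7), (5, 0), (3, 7), (4, 0),
]

p_max = max_region_probability(REGION_EXAMPLE_1, 7, 7)
print(f"maximum chance of the group: {100 * p_max:.2f} %")
```

At p = 1/2 the chance of the group is 352/16384, about 2.1 %; the maximum lies a little away from p = 1/2 and agrees with the tabled 2.4 % to the accuracy given.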


Table for m = 8, n = 6

      0        1        2        3        4        5        6        7        8

 6    0.012    0.18     0.71     2.5      5.3      13       —        —        —
 5    0.085    1.3      6.0      11       —        —        —        —        —        —        5
 4    0.44     3.9      19       —        —        —        —        —        —        —        4
 3    1.0      10       —        —        —        —        —        —        —        0.3      3
 2    8.0      —        —        —        —        —        —        20       3.8      0.86     2
 1    23       —        —        —        —        —        14       7.4      1.3      0.13     1
 0    —        —        —        16       10       5.3      2.3      0.62     0.19     0.012    0

      0        1        2        3        4        5        6        7        8        9

Table for m = 9, n = 5

Example 2. The situation is as before, except that the first sample has nine components, 
none of them defective, while the second sample has five components, four of them defective. 




Answer. Here, to use the table as given, we have to compare numbers effective, rather
than numbers defective, viz. we consider the pair (9, 1) rather than (0, 4). Entering the
(9, 5) table at (9, 1) we find 0.13. The result is evidence against the hypothesis of equal
quality, on the 0.13 % level of significance.

Thanks are due to Miss Lang, who has checked the computations.

REFERENCES 

Fisher, R. A. (1941). Statistical Methods for Research Workers, 8th ed. Edinburgh: Oliver and Boyd. 
Fisher, R. A. (1942). The Design of Experiments, ch. II. Edinburgh: Oliver and Boyd.





THE CHOICE OF STATISTICAL TESTS ILLUSTRATED ON THE 
INTERPRETATION OF DATA CLASSED IN A 2 x 2 TABLE 

By E. S. PEARSON 


CONTENTS

(i) Introductory
(ii) The choice of statistical tests
(iii) Application of this approach to the analysis of data classed in a 2 x 2 table
(iv) Problem I
(v) Problem II
(vi) Solution of Problem II, using the normal approximation
(vii) The classical approach to Problem II
(viii) Problem III
(ix) General comment
References
Appendix


(i) Introductory 

1. The problem of testing the significance of a difference between two proportions is one
which receives early attention in text-books on mathematical statistics, and it might be 
thought to be one of the questions whose final solution lies behind us. It is a problem whose 
simplicity makes it easy to examine the logical cogency of the methods put forward for its 
solution, but, on examination, it is evident that they have not yet been rounded off satis- 
factorily. The origin of the present paper lies partly in an investigation commenced in 1938 
and discussed at the time in College lectures, and partly in recent correspondence in Nature 
in which G. A. Barnard (1945a, b) and R. A. Fisher (1945a) have taken part.* This 
correspondence has suggested that in a problem of such apparent simplicity, starting from 
different premises, it is possible to reach what may sometimes be very different numerical 
probability figures by which to judge significance. 

2. Such a difference in levels of significance in the solution of an everyday problem is 
obviously puzzling to the users of statistical methods who are accustomed to accept the 
technique as an established procedure and have not the opportunity for a critical examina- 
tion of the conditions under which probability theory is brought to bear as a guide to action. 
For the question here at issue is a fundamental one of why and how our judgement is
influenced by the calculation of a probability, and the dilemma raised by the Barnard-Fisher
correspondence can only be answered in terms of our views on the practical function of the 
theory. We may all agree that in practice we use probability figures derived from an analysis 
of numerical data to help us to make up our minds on the next step, whether in experi- 
mental research or executive action. But what form of presentation of the probability set-up 
is likely to result in the greater number of sound decisions is likely to be always a matter for 
differences of opinion. 

3. All that I can do is to approach the problem of the 2x2 table from the viewpoint 
which appears most helpful to me. In the preceding paper Mr Barnard has elaborated the 

* There was also an earlier discussion on the same subject between E. B. Wilson (1941, 1942) and 
R. A. Fisher (1941).




views expressed in his letters to Nature. Such discussion is, I believe, desirable, even though
controversial issues are raised. For the value of the whole elaborate structure of the 
modern theory of mathematical statistics depends at least in part on the sense in which the 
individual statistician appreciates the meaning of the probability model he is using when 
drawing the practical conclusions from his analysis of data. I have used the words ‘in part’, 
for it is true that the analytical process of applying the statistical technique to experi- 
mental data may in itself be enormously illuminating even without paying any close regard 
to a final probability figure. Such is the case, for example, with the technique of analysis of 
variance, where the mere process of breaking up a total sum of squares into parts with which 
different sources of variability can be associated, brings with it a reward in clear thinking 
even without the application of a probability test. 

4. There is a very wide variety in the types of situation in which probability theory is 
introduced to help in reaching a decision as to further action.

(A) At one extreme we have the case where repeated decisions must be made on results 
obtained from some routine procedure carried out under controlled conditions. 

(B) At the other is the situation where statistical tools are applied to an isolated investiga- 
tion of considerable importance in which many of the issues involved in the conclusion can 
hardly be assessed in numerical terms. 

5. Two situations of this kind, in which the statistical technique involved is that of testing 
the significance of a difference between two proportions, may be illustrated from problems 
arising in the ‘proof’ of armour-piercing shot or shell. 

6. Example of type A. In the proof of small anti-tank, armour-piercing shot it might be
decided to set aside, as a standard, a batch of shot whose quality has been established by 
special trials; against this standard, later batches can be compared. The variable measured 
is the proportion of shot which fail to perforate a plate of specified thickness when fired with 
a given striking velocity. The use of standard shot is necessary for calibration purposes, 
because there are inevitable changes in toughness from one proof plate to another and only 
a limited number of shot can be fired at a single plate. Then the situation might be summed
up as follows:* 

Aim of proof. To ensure that as few batches as possible are passed into service which 
are less effective than the standard. 

Method of proof. Twelve rounds of the standard and twelve of the batch under test to be
fired, round for round, against a single test plate and a record kept of the number of failures
in each group, say a and b.

Routine sentencing rule. This should lay down a ready means of determining, from a
knowledge of a and b, whether to class the new batch as inferior to the standard or not. 

Assumptions accepted in using rule. That the two samples of twelve shot have each been
randomly selected from the much larger batches. That against the particular plate used, a
proportion p_1 of the standard and p_2 of the new batch would fail to give satisfactory
perforation at the specified striking velocity. That while p_1 and p_2 would be different for other
plates, if p_2 > p_1 for one plate, it will be so for all other plates. The objective is to segregate
batches of shot for which p_2 > p_1.

* It has been somewhat simplified for illustrative purposes, e.g. complete control of the striking 
velocity is not in practice possible. 





7. Example of type B. Two types of heavy armour-piercing naval shell of the same
calibre are under consideration; they may be of different design or made by different firms. 
Since the cost of producing and testing a single round of this kind runs into many hundreds 
of pounds, the investigation is a costly one, yet the issues involved are far reaching. Twelve 
shells of one kind and eight of the other have been fired; two of the former and five of the 
latter failed to perforate the plate. In what way can a statistical test contribute to the 
decision which must be taken on further action? 

8. In dealing with Example A the guiding principle followed in seeking help from the 
theory of probability can be very simple. We can set as our object a rule which: 

(i) will result in an increasing chance of detecting that p_2 > p_1, the larger the difference;

(ii) will leave only a small chance of segregating the new batch wrongly when, in fact, p_2 < p_1.

Diagrammatically the rule would consist in segregating the new batch when the point (a, b)
falls within some such area as that shown shaded in Fig. 1. In this problem involving a routine
procedure, it is the long-run frequency of different consequences of the proof sentencing which
is of importance, and probability theory is introduced to provide a measure of expected
frequency. This method of introducing the theory of probability into this proof problem is not
necessarily the only one that could be adopted in fixing a routine procedure, but it is a simple
one and, since simplicity has the merit of appealing to the user’s understanding, it has
great advantages.
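Requirements (i) and (ii) can be illustrated by computing the long-run frequency with which a given sentencing rule segregates the new batch. The sketch below is illustrative only: the rule "segregate when b − a is at least 5" is a hypothetical choice made for the example, not the rule of the paper, and the failures a and b are assumed to be independent binomial counts out of twelve rounds each.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial chance of exactly k failures in n rounds."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def chance_of_segregation(p1, p2, rounds=12, margin=5):
    """Chance that b - a >= margin, where a counts failures of the standard
    (proportion p1) and b failures of the new batch (proportion p2).
    The margin of 5 is a hypothetical rule chosen for illustration."""
    return sum(
        binom_pmf(a, rounds, p1) * binom_pmf(b, rounds, p2)
        for a in range(rounds + 1)
        for b in range(rounds + 1)
        if b - a >= margin
    )


# Chance of segregating the new batch as p2 moves away from p1 = 0.2.
for p2 in (0.2, 0.4, 0.6, 0.8):
    print(p2, round(chance_of_segregation(0.2, p2), 4))
```

For this hypothetical rule the chance of segregation rises steadily with p_2, while with p_1 = p_2 = 1/2 it is 536155/2^24, a little over 3 %.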

9. When dealing with Example B a very con- 
siderable number of factors must be weighed in 
the balance, and the result of a statistical test of 
significance could never be the over-riding one. 

There will be other information as to the effect of changes in shell design, possibly from 
shell of different calibre; information as to the uniformity in quality of output of the 
firm or firms concerned ; questions of cost and of general policy. He would be a bold man who 
would attempt to express these in numerical terms. Whereas when tackling problem A 
it is easy to convince the practical man of the value of a probability construct related to 
frequency of occurrence, in problem B the argument that ‘if we were to repeatedly do so
and so, such and such result would follow in the long run’ is at once met by the common- 
sense answer that we never should carry out a precisely similar trial again. 

10. Nevertheless, it is clear that the scientist with a knowledge of statistical method 
behind him can make his contribution to a round-table discussion, provided he has 
acquired a grasp of the practical issues. Starting from the basis that individual shell will 
never be identical in armour-piercing qualities, however good the control of production, 
he has to consider how much of the difference between (i) two failures out of twelve and 
(ii) five failures out of eight is likely to be due to this inevitable variability. There may be a 
number of ways of sizing up the position involving different assumptions or hypothetical 
constructs; he may follow one or several of these. The value of his advice is dependent almost
entirely on the soundness of his scientific judgement, and very little on whether his back-room
calculations have been based on inverse or direct probability or on an appeal to
fiducial argument.

Fig. 1

11. How far, then, can one go in giving precision to a philosophy of statistical inference?
It seems clear that in certain problems probability theory is of value because of its close 
relation to frequency of occurrence; such seems to be the case for my Example A. Tests can 
be built up to satisfy the practical requirements in this field. In other and, no doubt, more 
numerous cases there is no repetition of the same type of trial or experiment, but all the 
same we can and many of us do use the same test rules to guide our decision, following the 
analysis of an isolated set of numerical data. Why do we do this? What are the springs of 
decision? Is it because the formulation of the case in terms of hypothetical repetition helps 
to that clarity of view needed for sound judgement? Or is it because we are content that the 
application of a rule, now in this investigation, now in that, should result in a long-run 
frequency of errors in judgement which we control at a low figure ? On this I should not care 
to dogmatize, realizing how difficult it is to analyse the reasons governing even one’s own 
personal decisions. 

12. That the frequency concept is not generally accepted in the interpretation of statistical
tests is of course well known. With his characteristic forcefulness R. A. Fisher (1945b)
has recently written: ‘In recent times one often repeated exposition of the tests of signi- 
ficance, by J. Neyman, a writer not closely associated with the development of these tests, 
seems liable to lead mathematical readers astray, through laying down axiomatically, what 
is not agreed or generally true, that the level of significance must be equal to the frequency 
with which the hypothesis is rejected in repeated sampling of any fixed population allowed 
by hypothesis. This intrusive axiom, which is foreign to the reasoning on which the tests of 
significance were in fact based seems to be a real bar to progress.’

13. But the subject of criticism seems to me less an intrusive mathematical axiom than 
a mathematical formulation of a practical requirement which statisticians of many schools 
of thought have deliberately advanced. Prof. Fisher’s contributions to the development of 
tests of significance have been outstanding, but such tests, if under another name, were 
discovered before his day and are being derived far and wide to meet new needs. To claim 
what seems to amount to patent rights over their interpretation can hardly be his serious 
intention. Many of us, as statisticians, fall into the all too easy habit of making authoritative 
statements as to how probability theory should be used as a guide to judgement, but 
ultimately it is likely that the method of application which finds greatest favour will be that 
which through its simplicity and directness appeals most to the common scientific user’s 
understanding. Hitherto the user has been accustomed to accept the function of probability 
theory laid down by the mathematicians; but it would be good if he could take a larger share 
in formulating himself what are the practical requirements that the theory should satisfy 
in application. 


(ii) The choice of statistical tests 

14. One approach to follow in determining tests to be applied to the 2x2 class of problem 
follows the lines that Neyman and I have adopted since 1928 in dealing with tests of statis- 
tical hypotheses. Let me first recapitulate in broad terms the steps in that approach when 
applied to a problem where the universe of possible observations can be represented by a 





finite set of discrete points. A test of significance may be described as a method of analysis 
of statistical data which helps us to discriminate between alternative theories or hypotheses. 
In order to make use of the theory of probability in the sense here understood, a random 
process must either have been purposely introduced or be assumed to have been present in 
the collection of data; then the hypothesis very often concerns the values of parameters 
contained in the probability laws which, in the conceptual sphere, form the mathematical 
counterpart of the sampling distributions of experience. 

15. We proceed by setting up a specific hypothesis to test, H_0 in Neyman’s and my
terminology, the null hypothesis in R. A. Fisher’s. At the same time, in choosing the test, we
take into account alternatives to H_0 which we believe possible or at any rate consider it
most important to be on the look out for. Thus we wish the test to have maximum dis- 
criminating power within a certain class of hypotheses. Three steps in constructing the test 
may be defined : 

Step 1. We must first specify the set of results which could follow on repeated application 
of the random process used in the collection of the data; this may be termed the experi- 
mental probability set. 

Step 2. We then divide this set by a system of ordered boundaries or contours such that 
as we pass across one boundary and proceed to the next, we come to a class of results which 
makes us more and more inclined, on the information available, to reject the hypothesis 
tested in favour of alternatives which differ from it by increasing amounts. 

Step 3. We then, if possible, associate with each contour level the chance that, if H_0 is
true, a result will occur in random sampling lying beyond that level. 

This rather crude statement of procedure will be developed in more detail in discussing 
the problems that arise in connexion with the 2x2 table. 

16. Notes on these points . (a) Step 1. This involves the definition of what Neyman and 
I have termed the sample space, W. The application in three forms of the 2x2 problem is
discussed in paragraphs 19, 27 and 46 below. 

(b) Step 2. For a given hypothesis under test there may be a number of ways of deriving
a system of contours, and only in certain cases can there be said to be complete agreement
on which is the ‘best’. Practical expediency will often carry weight in the choice. It is widely
accepted that the choice cannot be made without paying regard to the admissible hypotheses
alternative to H_0, whether this process is given formal precision or taken as a broad guide.
In our first papers (Neyman & Pearson, 1928a, b) we suggested that the likelihood ratio
criterion, λ, was a very useful one to employ in determining a family of contours which
would be ordered in relation to our confidence in the hypothesis tested when set against
the background of admissible alternatives. Thus Step 2 preceded Step 3. In later papers
(Neyman & Pearson, 1933, 1936 and 1938) we started with a fixed value for the chance, ε,
of Step 3 and determined the associated contour, taking account of what we termed the
power of a test with regard to the alternative hypotheses. The family of Step 2 followed
on giving decreasing values to ε. However, although the mathematical procedure may
put Step 3 before 2, we cannot put this into operation before we have decided, under
Step 2, on the guiding principle to be used in choosing the contour system. That is why
I have numbered the steps in this order.

(c) Step 3. If this can be accomplished, we have what Neyman and I called control of the 
‘1st kind of error’. In problems where, as below, we are concerned with discrete rather than




continuous probability distributions (e.g. for the binomial, the Poisson, the multinomial 
and the hypergeometric distributions), this objective cannot always be achieved, and it 
may be necessary to be satisfied with a knowledge of an upper limit of the chance of rejecting 
the hypothesis tested when it is true. 

(iii) Application of this approach to the analysis of data classed in a 2 x 2 table 

17. The frequencies of the data in the table may be defined in the following notation: 


Table 1

           Col. 1    Col. 2    Total
Row 1        a         c         m
Row 2        b         d         n
Total        r         s         N


If we follow in turn the steps defined above to determine the method of interpretation of 
such data, the requirements of the appropriate tests are seen to follow very simply, although 
mathematical or computational difficulties arise in implementing them. On taking Step 1 
we can separate out at once the three types of problem which Barnard has differentiated;* 
these I shall call Problems I, II and III. They are distinguished by the sample space having
1, 2 and 3 dimensions respectively. From the mathematical point of view it might seem more
logical to take them in the reverse order, adding first one and then a second restriction to 
the 3-dimensioned case of Problem III. For a simple exposition, I think the reverse procedure 
of building up from I to III is preferable and this has been adopted in the following sections. 

(iv) Problem I 

18. This may be described as the test of the significance of the difference between two 
treatments after these have been randomly assigned to a group of N = m + n individuals
(Barnard terms it the 2x2 independence trial). To use the terminology of a particular
application, we may say that we are observing the presence or absence of ‘reaction X’.
The first treatment is applied to m and the second to n of the N individuals; as a result a/m
and b/n show reaction X.

19. In this case the random process has been applied within the group of N individuals, 
and its repetition would simply involve other random reassignments of the two treatments 
among the N . No assumption is made as to how the N individuals were selected from some 
larger universe. The repetition may be hypothetical, in the sense that it often could not 
take place, e.g. if reaction X = death. Indeed, repetition under the same essential conditions 
is frequently impossible in practice. But this correspondence between the frequency of 
results upon hypothetical repetition and the probability distribution of the counterpart 
mathematical model forms an accepted part of the process of reasoning whereby (following 

* Statisticians had, of course, all been more or less conscious of these differences, but, at any rate 
in my own case, it was discussion with Mr Barnard which made it easy to see the problem in its full 
clarity. 





the present approach) we use probability theory as a basis for inference. The hypothesis 
tested is that while some individuals show reaction X and some do not, the result would be 
the same whichever treatment were applied as far as these N individuals are concerned. Thus,
on the null hypothesis, there are r = a + b individuals who will react and s = c + d who will 
not, whatever the assignment of treatments. 


20. The chance that a will react in m and b = r − a in n is, therefore, if the hypothesis
be true,

\[ P\{a \mid N, r, m\} = \frac{m!\, n!\, r!\, s!}{N!\, a!\, b!\, c!\, d!}. \tag{1} \]

This expression is proportional to the coefficient of x^a in the hypergeometric series

\[ F(\alpha, \beta, \gamma, x) = F(-r, -m, n - r + 1, x). \tag{2} \]

Thus, taking m ≥ n, a can assume values of

(i) 0, 1, ..., r if r ≤ n,

(ii) r − n, r − n + 1, ..., r if n < r ≤ m,

(iii) r − n, r − n + 1, ..., m if r > m.

For this probability distribution, it is known (K. Pearson (1899) and Kendall (1943, p. 127))
that

\[ \text{Mean } a = \frac{mr}{N}, \tag{3} \]

\[ \text{Variance of } a = \sigma_a^2 = \frac{mnrs}{N^2 (N - 1)}. \tag{4} \]

21. For the particular case

N = 20, r = 7, m = 12, n = 8,

the terms in the distribution of P{a | 20, 7, 12} are shown as ordinates in Fig. 2 and given in
the accompanying Table 2. The experimental probability set consists of the eight alternative
values for a, viz. 0, 1, ..., 7 with which the probabilities tabled are associated if H_0 is
true. Further

\[ \text{Mean } a = \bar{a} = 4.2, \qquad \sigma_a = 1.0721. \tag{5} \]
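As a check on (5), the special case may be substituted into the standard expressions for the mean and variance of the hypergeometric distribution, here written with s = N − r = 13. This worked step is added for illustration and uses only the figures already given.

```latex
\[
\bar{a} = \frac{mr}{N} = \frac{12 \times 7}{20} = 4.2,
\qquad
\sigma_a^2 = \frac{mnrs}{N^2 (N-1)}
           = \frac{12 \times 8 \times 7 \times 13}{20^2 \times 19}
           = \frac{8736}{7600} \approx 1.1495,
\qquad
\sigma_a \approx 1.0721.
\]
```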




22. Next consider Step 2. The purpose of the investigation is to test the hypothesis
that the difference between a/12 and (r − a)/8 has resulted simply from a random
partition of 20 individuals, of whom r will show reaction X in whichever treatment
group they are included. The experiment gives r = 7. The contour levels fall between
the 8 points of the set as shown in Fig. 2; the further a lies towards the right, the more
inclined we shall be to accept the alternative hypothesis that a/12 > (r − a)/8 because
treatment 1 is more effective than treatment 2. The further a lies to the left, the more
we shall incline towards the reverse alternative. To complete Step 3, we have only to 
calculate the sums of the tail terms of the hypergeometric series, as shown in Table 2 for 
the special case. 


Table 2. Problem I. Chances for special case N = 20, r = 7, m = 12, if $H_0$ is true

                           Chance of a or less
  a     Chance of a     True value    Normal approx.

  0       0.0001          0.000          0.000
  1       0.0043          0.004          0.006
  2       0.0477          0.052          0.056
  3       0.1987          0.251          0.257
  4       0.3576            —              —

                           Chance of a or more
                        True value    Normal approx.

  5       0.2861          0.392          0.390
  6       0.0954          0.106          0.113
  7       0.0102          0.010          0.016


23. Having set up the machinery of the test, we come to the practical question. Beyond 
which contour levels must a fall before we infer that there is a treatment difference? Not, 
I think, in the example, if a were 3, 4 or 5 ; possibly if a = 6, more probably if a = 2 and almost 
certainly if a = 0, 1 or 7. Were we to fix as critical levels those between a = 1 and 2 on the 
one hand, and between a = 6 and 7 on the other, then we should be guided in our decision 
by the following knowledge: if there were no treatment difference, so that seven out of the 
twenty individuals would have shown reaction X whichever treatment were applied, then 
the chance under random assignment of treatments that a < 2 or > 6 is only 0.014 or 1 in 70.
Had we taken the critical levels between 2 and 3 and between 6 and 7, the corresponding
chance would be 0.062 or 1 in 16. This summing up in terms of probability helps towards
the balanced decision on the next practical step to be taken, because it helps us to assess the
extent of purely chance fluctuations that are possible. It may be assumed that in a matter
of importance we should never be content with a single experiment applied to twenty
individuals; but the result of applying the statistical test, with its answer in terms of the chance
of a mistaken conclusion if a certain rule of inference were followed, will help to determine
the lines of further experimental work and the degree of confidence with which we proceed
provisionally to adopt a new technique.
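The two pairs of critical levels considered in this paragraph can be compared numerically. The sketch below is an illustration only (the helper `rejection_chance` is a hypothetical name); it sums the exact hypergeometric terms for N = 20, r = 7, m = 12. Carried to four decimals the first chance comes out at about 0.0147, slightly above the 0.014 obtained by adding the rounded tail sums 0.004 and 0.010 of Table 2.

```python
from math import comb

N, r, m = 20, 7, 12
n = N - m
# Exact chances of each a = 0, ..., 7 if H_0 is true
pmf = {a: comb(m, a) * comb(n, r - a) / comb(N, r) for a in range(r + 1)}

def rejection_chance(lower_max, upper_min):
    """Chance, if H_0 is true, that a <= lower_max or a >= upper_min."""
    return sum(p for a, p in pmf.items() if a <= lower_max or a >= upper_min)

narrow = rejection_chance(1, 7)   # critical levels between 1 and 2, and between 6 and 7
wide = rejection_chance(2, 7)     # critical levels between 2 and 3, and between 6 and 7
print(round(narrow, 4), round(wide, 4))
```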

24. An experiment falling under this head has the advantage that the random process 
introduced is under complete control. The analysis will give an answer in probability terms 
whether the N individuals have been randomly selected from a larger whole or not. But this 
answer is limited in the sense that it relates only to the N; if we wish to draw conclusions 
about a wider population or populations, then a random selection of the N or, separately, 
of both its parts m and n is needed. Thus we come to Problems II and III. 

25. Approximation to the hypergeometric terms. When dealing with small numbers, the
calculation of the tail terms of the series may not be laborious, but it soon becomes so when
r is large. An obvious approximation is that obtained by using an integral under the normal
curve with the mean and standard deviation of equations (3) and (4) to represent the sum
of the hypergeometric terms. As usual when approximating to the sum of the terms for
x = a, a + 1, a + 2, ..., etc., of a discrete probability distribution by the integral under a
continuous curve, we take this integral from the point $x = a - \tfrac{1}{2}$. Thus Fig. 3 shows the
normal curve

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma_a} \exp\left[-\frac{(x - \bar{a})^2}{2\sigma_a^2}\right], \tag{6}$$

with $\bar{a}$ and $\sigma_a$ as in equations (5), and the approximation to the sum of the hypergeometric
terms for a = 6 and 7 is

$$\int_{5.5}^{\infty} p(x)\,dx,$$

represented by the area marked with cross-hatching. The approximations for different 
levels are shown in Table 2, and are seen in this case to be quite adequate for the purpose of 
the test. Further comparisons are made in the Appendix, and it appears that provided m 
and n are fairly nearly equal, as they are likely to be in most planned experiments of the 
Problem I type, the normal approximation is surprisingly good. Yates (1934) has suggested 
a method of further correction. 
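The normal approximation with the half-unit adjustment is easily reproduced. The following Python sketch is illustrative only; the function names are assumptions, and the normal integral is obtained from the error function of the standard library rather than from tables.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Area under the standardized normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

N, r, m = 20, 7, 12
n, s = N - m, N - r
mean_a = r * m / N                               # equation (3)
sd_a = sqrt(m * n * r * s / (N**2 * (N - 1)))    # from equation (4)

def approx_lower_tail(a):
    """Normal approximation to the chance of a or less (integral up to a + 1/2)."""
    return normal_cdf((a + 0.5 - mean_a) / sd_a)

def approx_upper_tail(a):
    """Normal approximation to the chance of a or more (integral from a - 1/2)."""
    return 1.0 - normal_cdf((a - 0.5 - mean_a) / sd_a)

print([round(approx_lower_tail(a), 3) for a in (0, 1, 2, 3)])
print([round(approx_upper_tail(a), 3) for a in (5, 6, 7)])
```

The printed values may be set beside the 'Normal approx.' columns of Table 2; `approx_upper_tail(6)` is the cross-hatched area from 5.5 upwards.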

26. The correction for continuity. In the 2 × 2 table connexion, the improvement obtained
by taking the normal integral (i) from $x = a - \tfrac{1}{2}$ if $a > \bar{a}$ or (ii) from $x = a + \tfrac{1}{2}$ if $a < \bar{a}$ (so that
we are summing for the lower tail), was pointed out by Yates (1934) and has often been
termed ‘Yates’s correction for continuity’. It is, however, the natural adjustment to make
on the basis of the Euler-Maclaurin theorem when approximating to a sum of ordinates
by an integral, and, without wishing to detract from the value of Yates’s suggestion in this
particular problem, it should be pointed out that the adjustment was used by statisticians
well before 1934, when employing a normal or skew curve to give the sum of terms of a
binomial or hypergeometric series.*


(v) Problem II 

27. This may be described as the test of whether the proportion of individuals bearing a 
character A is the same in two different populations, from each of which a random sample 
has been drawn, i.e. the test of the hypothesis that 

$$p_1(A) = p_2(A) = p, \tag{7}$$

* The method was in use in the Department of Applied Statistics when I joined the staff in 1921,
and may have been current many years before that.



where p is some common but unspecified proportion. Barnard describes this as the case of 
the 2x2 comparative trial. Here m individuals have been drawn at random from the first 
population and n from the second, and it is found that a/m and b/n, respectively, bear the
character A. The conditions are assumed to be such that if the random procedure of selection 
were repeated, the appropriate probability distributions for a and b would be given by the 
terms of binomial expansions. Table 3 shows the observed results. 


Table 3

                No. with         No.
                character A      without A      Total

  1st sample        a               c             m
  2nd sample        b               d             n

  Total             r               s             N


Fig. 3. The curves ABC and A'B'C' represent the significance contours $L_\epsilon$ and $L'_\epsilon$, respectively.

In this problem there have been two applications of a random selection process, not one
as for Problem I, and the experimental probability set consists of the (m + 1)(n + 1) alternative
values of the doublet (a, b) ($0 \le a \le m$, $0 \le b \le n$) which can be represented in the lattice
diagram shown in Fig. 3 for the special case m = 12, n = 8. It might, of course, be argued
that in the hypothetical repetition of the selection process m and n need not remain constant,
but this, I think, would introduce an unnecessary complication into the probability set-up.






28. The question before us is whether the result (a, b) is consistent with the hypothesis
$H_0$ defined in equation (7) above, or whether it suggests that either $p_1 > p_2$ or that $p_1 < p_2$.
A little reflexion shows that we have no reason to reject $H_0$ if the point (a, b) lies near the
diagonal line on which a/m = b/n, but, broadly speaking, are more and more likely to do so
the farther the point falls from this line in the direction of the corners (0, n) and (m, 0) of the
lattice diagram. This statement requires amplification. In defining the significance contours
we may consider the following question: If $H_0$ is not true, what departures from equality
in $p_1$ and $p_2$ do we regard it of equal importance to detect? Should the power of the test be
roughly the same for constant values, for example, of

(a) $p_1 - p_2$, (b) $p_1/p_2$ or (c) $\dfrac{p_1(1 - p_2)}{p_2(1 - p_1)}$?

The procedure which I have adopted in the sections which follow is frankly one of
expediency. I have not considered in detail how to choose a family of significance contours
satisfying requirements formulated in advance, but have taken those suggested by the
customary large-sample procedure which gives contours of the form ABC, A'B'C' drawn
in Fig. 3. These will, I believe, make the power of the test to detect a difference more nearly
dependent on the ratio of the odds given by (c) than on either of the expressions (a) or (b).
E. B. Wilson (1941) chooses the expression (a). This point, however, needs further
investigation. It should be noted that a similar problem, in the case where the sampling
distributions follow the Poisson law, was discussed very fully by Przyborowski & Wilenski (1939).

29. Besides involving a 2-dimensional instead of a 1-dimensional experimental
probability set, Problem II differs from Problem I in that we need an answer which is
independent of the unknown common probability p of the null hypothesis. In Problem I the part
of p was played by the fraction r/N given by the data. We are concerned now with what
Neyman and I (Neyman & Pearson, 1933) have termed a composite hypothesis, and, were
it possible, would like the contour levels to bound regions which are ‘similar to the sample
space with regard to the parameter p’ (loc. cit. p. 313) (i.e. are independent of p). The
following considerations show the lines along which a first attack on the problem can proceed.

30. If $H_0$ is true and equation (7) holds, then the probability of the observed result may
be written*

$$P_2\{a \mid p, m\} \times P_2\{b \mid p, n\} = \frac{m!}{a!\,c!}\,p^a(1-p)^c \times \frac{n!}{b!\,d!}\,p^b(1-p)^d \tag{8.1}$$

$$= \frac{N!}{r!\,s!}\,p^r(1-p)^s \times \frac{m!\,n!\,r!\,s!}{a!\,b!\,c!\,d!\,N!} \tag{8.2}$$

$$= P_2\{r \mid p, N\} \times P_1\{a \mid N, r, m\}. \tag{8.3}$$

Thus the probability of obtaining the doublet (a, b) in sampling from two populations with
a common p may be regarded as the product of two terms:

(i) The probability that a + b = r or that the point (a, b) in Fig. 3 falls on a diagonal line
on which r = constant. This probability, $P_2\{r \mid p, N\}$, is the (r + 1)th term in the expansion
of the binomial

$$\{(1 - p) + p\}^N.$$

(ii) The relative probability, given r, of the observed partition into a and b = r − a; this
is independent of p and is identical with the expression $P_1\{a \mid N, r, m\}$ of equation (1), i.e. is
proportional to a term of the hypergeometric series (2).


* It will be seen that $P_1\{\ \}$ has been used to denote a hypergeometric probability and $P_2\{\ \}$ a
binomial probability.
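The factorization of the joint probability into a binomial term for r and a hypergeometric term free of p can be confirmed numerically for any chosen doublet. The Python sketch below is an illustration only; the value of p and the doublet (a, b) are arbitrary choices, and the function names are assumptions.

```python
from math import comb

def binom_pmf(k, size, p):
    """P_2{k | p, size}: binomial probability of k in a sample of the given size."""
    return comb(size, k) * p**k * (1 - p)**(size - k)

def hypergeom_pmf(a, N, r, m):
    """P_1{a | N, r, m}: hypergeometric probability, which does not involve p."""
    return comb(m, a) * comb(N - m, r - a) / comb(N, r)

m, n, p = 12, 8, 0.3      # p is an arbitrary illustrative value
a, b = 5, 2               # an arbitrary doublet on the lattice
N, r = m + n, a + b

joint = binom_pmf(a, m, p) * binom_pmf(b, n, p)              # product of two binomial terms
factored = binom_pmf(r, N, p) * hypergeom_pmf(a, N, r, m)    # binomial for r times hypergeometric
print(joint, factored)
```

The two printed numbers agree to rounding error whatever value of p is substituted, which is the property used in the paragraphs that follow.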




31. If, now, it were possible to draw a boundary line $L_\epsilon$ such as ABC shown in Fig. 3,
cutting off at the end of each diagonal, r = constant, a group of points (a, r − a) such that

$$\sum_a [P_1\{a \mid N, r, m\}] = \epsilon, \tag{9}$$

where $\epsilon$ is a fraction between 0 and 1 chosen at will, then the requirement of Step 3 would
be satisfied. For in rejecting $H_0$ when (a, b) falls beyond this boundary,* the chance of doing
so if $H_0$ were true would be

$$\sum_{r=0}^{N} [P_2\{r \mid p, N\} \times \epsilon] = \epsilon \times \sum_{r=0}^{N} [P_2\{r \mid p, N\}] = \epsilon, \tag{10}$$

i.e. would be independent of the unknown common p of the hypothesis tested. The test
would then be analogous to ‘Student’s’ test for the significance of the difference between
two means, where we have a system of contour levels $L_\epsilon$ each associated with a chance $\epsilon$,
independent of the values of any unknown parameters which are irrelevant to the
composite hypothesis tested.

32. Unfortunately, this objective cannot be achieved because we are not dealing with
continuous probability distributions and $P_1\{a \mid N, r, m\}$ exists only at discrete, integral
values of a. If we follow the present line of approach, all that is possible is to take contour
or significance levels which cut off from an end of each diagonal, r = constant, a group of
points for which

$$\sum_a [P_1\{a \mid N, r, m\}] = \beta_r \le \epsilon. \tag{11}$$

Then, in rejecting $H_0$ when (a, b) falls beyond such a contour, we know that the chance of
doing so, if $H_0$ is true, will be

$$\sum_{r=0}^{N} [P_2\{r \mid p, N\} \times \beta_r] \le \epsilon. \tag{12}$$

It is clear that the amount by which the probability falls below $\epsilon$ will be a function of p,
and that in taking Step 3 we are only associating with each significance level $L_\epsilon$ an upper
limit, $\epsilon$, to the probability of rejecting $H_0$ when it is true.

33. We have still, of course, to determine the most appropriate system of significance
levels and to set out a ready means of finding an upper limit, $\epsilon$, associated with the level on
which an observed doublet (a, b) falls.† Mr Barnard has broken new ground in

(i) defining for this Problem II one systematic method of determining a family of levels
$L_\epsilon$ based on certain clearly defined principles;

(ii) determining the true upper bound to the associated probability $\epsilon$ which, in the case
of small samples at any rate, may be considerably below that which has hitherto been used.

Since, however, much tabling is needed before his theoretical advance can be followed
by a practical working rule available for samples of any sizes, m and n, I think it is worth
while describing the cruder handling of the lattice diagram which I had discussed in 1938-9

* There would be a similar series of boundaries, $L'_\epsilon$, below the diagonal a/m = b/n, such as A'B'C'
of Fig. 3.

† The likelihood ratio $\lambda$ might be used in determining the family of significance contours, as was
suggested in connexion with the general $\chi^2$ problem (Neyman & Pearson, 1928b, p. 283). In large
samples $\lambda$ would approximately equal $e^{-\frac{1}{2}u^2}$, where u is given by equation (22) below.





lectures. This involves, perhaps, not much more than a restatement of what may be termed 
the classical approach to Problem II (see paras. 43 and 44 below), but it does bring out the 
difference between Problems I and II, which I think important. 

34. It may be well to emphasize here that this distinction between the handling of 
Problems I and II is not universally accepted. Fisher has set out his approach as follows in 
a paper read before the Royal Statistical Society (1935): ‘ To the many methods of treatment 
hitherto suggested for the 2x2 table the concept of ancillary information suggests this new 
one. Let us blot out the contents of the table, leaving only the marginal frequencies. If it 
be admitted that these marginal frequencies by themselves supply no information on the 
point at issue, namely, as to the proportionality of the frequencies in the body of the table, 
we may recognize the information they supply as wholly ancillary; and therefore recognize 
that we are concerned only with the relative probabilities of occurrence of the different ways 
in which the table can be filled in, subject to these marginal frequencies.’ 

This view has also been supported by Yates (1934). As I understand it, Fisher would refer 
the observation (a,b) to a linear set (as in my Problem I), however the data have been 
collected; this attitude follows readily if we discard the requirement that the probability 
distribution used in the test must be related to the frequency distribution that would be 
generated by repeated application of the random sampling process employed in the experi- 
ment. It will be seen that with Fisher’s approach there is a gain in simplicity in handling 
the analysis ; it must remain a matter of opinion whether there is a loss in the relevance 
of the probability construct to the question at issue. It is, of course, only when handling
small samples or in cases where (a, b) lies close to one of the corners (0, 0) or (m, n) of the
lattice that this need for choice between probability constructs is thrust upon us.


(vi) Solution of Problem II, using the normal approximation

35. If the samples are large, the calculation of hypergeometric terms becomes laborious
and we turn naturally, as in so many other statistical problems, to the approximation using
the normal curve. In fact, except when r or s are very small or m and n very different in
magnitude, the normal curve with mean and standard deviation given by equations (3)
and (4) provides a surprisingly good approximation to the relative probability distribution
of a for fixed r, viz. $P_1\{a \mid N, r, m\}$ (see Appendix). Define $u_\epsilon$ as the deviate of the standardized
normal curve for which

$$\int_{u_\epsilon}^{\infty} \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}u^2}\,du = \epsilon \quad (\epsilon \le \tfrac{1}{2}). \tag{13}$$

Then we can draw across the lattice diagram a significance level $L_\epsilon$ above and another $L'_\epsilon$
below* the diagonal a/m = b/n such that

(i) all points (a, b) for which

$$\frac{\bar{a} - (a + \tfrac{1}{2})}{\sigma_a} \ge u_\epsilon \tag{14}$$

lie beyond, i.e. above, $L_\epsilon$;

(ii) and all points (a, b) for which

$$\frac{(a - \tfrac{1}{2}) - \bar{a}}{\sigma_a} \ge u_\epsilon \tag{15}$$

lie beyond, i.e. below, $L'_\epsilon$.


* The words ‘above’ and ‘below’ are used in the sense of Figs. 3 and 4.




If we wish to take special action either when a/m is significantly less than b/n or significantly
greater, then we shall use both levels $L_\epsilon$ and $L'_\epsilon$; if only, however, when a/m < b/n, then we use
$L_\epsilon$. The corresponding probability levels would be obtained by making $\epsilon$ for the second case
twice its value for the first. Fig. 4 shows the 247 relative probabilities $P_1\{a \mid N, r, m\}$ for the
case m = 18, n = 12. The unbroken, stepped lines are two contour levels determined in this
way. Purely for convenience in drawing, the level with $\epsilon$ = 0.05 and $u_{0.05}$ = 1.6449 has been
put above the diagonal and that with $\epsilon$ = 0.01 and $u_{0.01}$ = 2.3263 below.

36. If the normal approximation to the hypergeometric series were correct, it would
follow that along every diagonal, r = constant, the sum of the relative probabilities for
points above $L_\epsilon$ would satisfy the inequality (11). Hence the inequality (12) for the complete
area of the lattice above $L_\epsilon$ would hold, whatever the value of the common p. A similar result
would hold for the area below $L'_\epsilon$. Of course, the normal approximation will not hold
precisely, particularly when r or s are small, but here we shall generally be on the safe side, in
the sense that the hypergeometric distribution is flat-topped with abrupt ends so that the
$\beta_r$ of equation (11) will be considerably less than $\epsilon$, and often zero.

37. It is interesting to examine the results set out in Fig. 4 with the help of the detailed
calculations given in Table 4. Columns (2) and (3) give, for constant r, the mean and standard
deviation of $P_1\{a \mid 30, r, 18\}$, while columns (4) (for $L_{0.05}$) and (8) (for $L'_{0.01}$) give the cut-off
points defined by the normal approximation, i.e.

$$a_1 = \bar{a} - \tfrac{1}{2} - u_{0.05} \times \sigma_a \quad \text{and} \quad a_2 = \bar{a} + \tfrac{1}{2} + u_{0.01} \times \sigma_a. \tag{16}$$

The sums of the relative probabilities $P_1\{a \mid 30, r, 18\}$ for $a \le a_1$ and $a \ge a_2$ are given in
cols. (5) and (9) respectively. Thus, for example, for r = 7

$$a_1 = 4.2 - 0.5 - 1.6449 \times 1.1543 = 1.80,$$

and the sum of the probabilities for a = 0 and 1 is

$$0.0004 + 0.0082 = 0.0086.$$

These are the tail sums, termed $\beta_r$ in equation (11). It is clear from an examination of
cols. (5) and (9) that they are all less, and many of them very much less, than 0.05 and 0.01.
This is inevitable with a discrete distribution containing few terms. The contour levels have
been drawn conventionally in Fig. 4 as steps passing through the half-integer points and
not through the cut-off points of cols. (4) and (8). Clearly, whichever way they are drawn,
they will separate off the same subset of the (m + 1)(n + 1) points in the lattice diagram.

38. The next question is this. If we were to use either of these levels, what in fact would
be the chance of the sample doublet (a, b) falling beyond, if the null hypothesis were true?
This will depend on the common value of p. The product sums

$$\sum_{r=0}^{N} [P_2\{r \mid p, N\} \times \beta_r] = \sum_{r=0}^{N} \left[\frac{N!}{r!\,s!}\,p^r(1-p)^s \times \beta_r\right] \tag{17}$$

obtained by multiplying the expressions in cols. (5) and (9) of Table 4 by the appropriate
binomial terms are shown for a variety of values of p in Table 5, cols. (2) and (3). It is clear
at once how far on the safe side we are in saying that these chances are ≤ 0.05 and 0.01
respectively. Similar calculations were carried out for a second example, taking m = n = 10,





Normal curve approximations to significance levels 




Table 4. Significance levels for case m = 18,






and the results are shown in Table 5, cols. (6) and (7). In this case, the actual chances of
(a, b) falling on or beyond the significance levels are even further below the nominal limits
of 0.05 and 0.01. In fact, it becomes clear that in the case of small samples, at any rate, this
method of introducing the normal approximation gives such an overestimate of the true
chances of falling beyond a contour as to be almost valueless.
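The calculation behind the Method 1 figures can be sketched in Python for the first example, m = 18, n = 12, and the level $L_{0.05}$. This is an illustration under stated assumptions, not a reproduction of the original computing: the names `beta_r` and `true_chance` are hypothetical, the deviate 1.6449 is taken from para. 37, and exact hypergeometric terms are summed for every diagonal r = 0, ..., 30.

```python
from math import comb, sqrt

M, N_SECOND = 18, 12          # sample sizes m and n of the first example
N_TOTAL = M + N_SECOND
U_05 = 1.6449                 # normal deviate cutting off an upper tail of 0.05

def hypergeom_pmf(a, r):
    """P_1{a | N, r, m} for the fixed sizes above."""
    if a < max(0, r - N_SECOND) or a > min(r, M):
        return 0.0
    return comb(M, a) * comb(N_SECOND, r - a) / comb(N_TOTAL, r)

def beta_r(r):
    """Tail sum on the diagonal r: points with a at or below the cut-off a_1."""
    s = N_TOTAL - r
    mean_a = r * M / N_TOTAL
    sd_a = sqrt(M * N_SECOND * r * s / (N_TOTAL**2 * (N_TOTAL - 1)))
    cut_off = mean_a - 0.5 - U_05 * sd_a      # mean less 1/2 less u times s.d.
    return sum(hypergeom_pmf(a, r) for a in range(r + 1) if a <= cut_off)

def true_chance(p):
    """Binomial-weighted sum of the tail sums over all diagonals."""
    return sum(comb(N_TOTAL, r) * p**r * (1 - p)**(N_TOTAL - r) * beta_r(r)
               for r in range(N_TOTAL + 1))

print(round(beta_r(7), 4))
for p in (0.1, 0.3, 0.5):
    print(p, round(true_chance(p), 4))
```

`beta_r(7)` is the tail sum for a = 0 and 1 on the diagonal r = 7; the values of `true_chance(p)` are the quantities tabulated against p for Method 1 in Table 5.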

Table 5. Showing the difference between nominal and actual significance levels
(entries in cols. (2)-(9) are the true chance of falling on or beyond the level named)

                1st example: m = 18, n = 12               2nd example: m = 10 = n
  p           Method 1           Method 2              Method 1           Method 2             p
  (if H_0   L_0.05   L'_0.01   L_0.05   L'_0.01      L_0.05   L'_0.01   L_0.05   L'_0.01     (if H_0
  true)                                                                                       true)
  (1)        (2)      (3)       (4)      (5)          (6)      (7)       (8)      (9)         (10)

  0.05      0.0010   0.0000    0.0478   0.0000       0.0000   0.0000    0.0009   0.0000       0.05
  0.1       0.0054   0.0000    0.0602   0.0003       0.0005   0.0000    0.0251   0.0005       0.1
  0.2       0.0141   0.0003    0.0483   0.0043       0.0037   0.0007    0.0455   0.0037       0.2
  0.3       0.0174   0.0012    0.0490   0.0091       0.0058   0.0014    0.0495   0.0058       0.3
  0.4       0.0204   0.0023    0.0542   0.0108       0.0002   0.0017    0.0540   0.0062       0.4
  0.5       0.0219   0.0028    0.0498   0.0109       0.0002   0.0015    0.0672   0.0062       0.5
  0.6       0.0221   0.0035    0.0437   0.0119                                                0.6
  0.7       0.0204   0.0037    0.0431   0.0120                                                0.7
  0.8       0.0120   0.0031    0.0459   0.0113             Repeat as for 1 − p                0.8
  0.9       0.0019   0.0009    0.0282   0.0052                                                0.9
  0.95      0.0001   0.0001    0.0058   0.0010                                                0.95


39. Before considering a second method, it will be useful to recapitulate certain
characteristics of what I have termed Method 1. It provides for any nominal value of $\epsilon$ one systematic
procedure of defining a critical boundary or significance level cutting off a region from the
lattice diagram. Neither the subgroup of points cut off, nor the sum of the probabilities
associated with them for a given p, will alter continuously with $\epsilon$; they will change by discrete
steps as the cut-off point, defined in para. 37, passes through a point (a, b). While we shall
sometimes want to know whether the observed (a, b) falls beyond a level $L_\epsilon$ specified in
advance, more often we shall ask what is the level on which (a, b) falls. This, using Method 1,
we find by calculating

$$u = \frac{\bar{a} - (a + \tfrac{1}{2})}{\sigma_a} \ \text{ if } a < \bar{a} \quad \text{or} \quad u = \frac{(a - \tfrac{1}{2}) - \bar{a}}{\sigma_a} \ \text{ if } a > \bar{a}, \tag{18}$$

and finding $\epsilon$ from the normal integral of equation (13). In this way the nominal chance $\epsilon$
will be a little nearer the true upper limit than the figures in Table 5 suggest,* but not enough
to modify the criticism expressed above.

* It will be seen from Table 4 that no point (a, b) gives a $\beta_r$ in cols. (5) and (9) of exactly 0.05 or 0.01,
respectively, so that no points actually lie on $L_{0.05}$ or $L'_{0.01}$.




40. Method 2. The introduction of the correction of ½ for continuity is certainly
appropriate in using the normal approximation to the hypergeometric series in Problem I,
but I think it is not helpful in Problem II where we are concerned with a 2-dimensional
experimental probability set. If instead of obtaining significance levels $L_\epsilon$ and $L'_\epsilon$ as in
paras. 35-37, we obtain them from inequalities similar to (14) and (15) but with the
correction of ½ omitted, then there are several points to be noted:

(a) For the significance level $L_\epsilon$, the expression

$$\sum_a [P_1\{a \mid N, r, m\}], \tag{19}$$

where the summation is for values of a on the diagonal, r = constant, for which

$$a \le \bar{a} - u_\epsilon \times \sigma_a, \tag{20}$$

will be sometimes less and sometimes greater than $\epsilon$. Hence, in the balance, it seems likely
that the chance of the point (a, b) lying beyond $L_\epsilon$, or

$$\sum_{r=0}^{N} \Big[P_2\{r \mid p, N\} \times \sum_a P_1\{a \mid N, r, m\}\Big], \tag{21}$$

will lie closer to $\epsilon$ than when the ½ correction is used. The position will be the same for $L'_\epsilon$.

(b) In drawing repeated samples of m and n from two populations in which there is a
common chance, p, of an individual possessing character A, the ratio

$$u = \frac{a - \bar{a}}{\sigma_a} = \frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}} \tag{22}$$

has, whatever be p, (i) an expectation of zero, (ii) a unit standard deviation.* The shape of
the distribution will, of course, depend on p, but, faute de mieux, we may not in the long run
do too badly by assuming it to be normal. It is, of course, the weighted combination of a
number of hypergeometric series whose shape depends on r.
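The statement that u has zero expectation and unit standard deviation whatever the value of p can be checked by direct enumeration over the lattice. The Python sketch below is illustrative only; the names are assumptions, the sizes m = 18, n = 12 and the value p = 0.3 are arbitrary, and the indeterminate cases r = 0 and s = 0 are left out of the enumeration.

```python
from math import comb, sqrt

def binom_pmf(k, size, p):
    """Binomial probability of k in a sample of the given size."""
    return comb(size, k) * p**k * (1 - p)**(size - k)

def u_ratio(a, b, m, n):
    """Deviation of a from rm/N divided by its standard deviation for fixed r."""
    N, r = m + n, a + b
    s = N - r
    return (a - r * m / N) / sqrt(m * n * r * s / (N**2 * (N - 1)))

def moments_of_u(m, n, p):
    """Mean and variance of u over the lattice, excluding r = 0 and s = 0."""
    total = mean = second = 0.0
    for a in range(m + 1):
        for b in range(n + 1):
            if a + b in (0, m + n):
                continue                      # u would be 0/0 here
            w = binom_pmf(a, m, p) * binom_pmf(b, n, p)
            u = u_ratio(a, b, m, n)
            total += w
            mean += w * u
            second += w * u * u
    mean /= total
    return mean, second / total - mean**2

print(moments_of_u(18, 12, 0.3))
```

The result follows because, on each diagonal r = constant, a has exactly the hypergeometric mean and variance used to standardize it.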


41. Consider the result of applying this Method 2 to the case m = 18, n = 12 already
discussed. The procedure for determining the 0.05 and 0.01 significance levels will be exactly
as under Method 1, except that the continuity correction of ½ is omitted. The resulting levels
are shown as dashed, stepped lines in Fig. 4.† They fall, on the whole, inside the significance
levels obtained by Method 1. Now turn to Table 4, where cols. (6) and (10) show the cut-off
points a half unit further in towards the diagonal a/m = b/n. Cols. (7) and (11) give the values
of the expression (19); some of these are considerably above the nominal values of $\epsilon$ = 0.05 and 0.01, others
are still well below. But from the approach to Problem II that has been adopted, this is
immaterial since the experimental probability set is the 2-dimensioned one of the lattice
diagram and is not restricted to the diagonal r = constant on which the observed point
(a, b) may happen to lie. What we are concerned with is the summed chance given by
expression (21) and the value of this is given for eleven values of p in cols. (4) and (5) of Table 5. It
will be seen that this true chance does sometimes exceed the nominal values of 0.05 and 0.01,


* Provided cases where r or s are zero, making the expression (22) indeterminate with u = 0/0,
are excluded. Mr Barnard has pointed out that one way of avoiding this exclusion would be to lay
down that, when u = 0/0, we assign to the ratio a value chosen at random from a population (say
normal) with zero mean and unit variance.

† Again, for convenience the 5 % level is drawn above and the 1 % level below the diagonal.





but never by very much. Again, for the second example with m = 10 = n (Table 5, cols. 
(8) and (9)) the true chance, while it sometimes exceeds the nominal value, is always con- 
siderably nearer it than using the significance levels of Method 1. 
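The comparison of the two methods may be sketched in Python as follows. This is an illustration only: the function names and the `correction` argument are assumptions introduced here, the deviate 1.6449 is that used for the 0.05 level, and the diagonals r = 0 and r = N are excluded as indeterminate, which is an assumption made in line with the exclusion of the 0/0 cases.

```python
from math import comb, sqrt

def tail_sum(r, m, n, u_eps, correction):
    """Sum of hypergeometric terms for a <= mean - correction - u_eps * s.d."""
    N = m + n
    s = N - r
    if r == 0 or s == 0:
        return 0.0                 # assumption: indeterminate diagonals are left out
    mean_a = r * m / N
    sd_a = sqrt(m * n * r * s / (N**2 * (N - 1)))
    cut_off = mean_a - correction - u_eps * sd_a
    lo, hi = max(0, r - n), min(r, m)
    return sum(comb(m, a) * comb(n, r - a) / comb(N, r)
               for a in range(lo, hi + 1) if a <= cut_off)

def true_chance(p, m, n, u_eps, correction):
    """Binomial-weighted sum of the tail sums over all diagonals r = 0, ..., N."""
    N = m + n
    return sum(comb(N, r) * p**r * (1 - p)**(N - r)
               * tail_sum(r, m, n, u_eps, correction)
               for r in range(N + 1))

for p in (0.1, 0.3, 0.5):
    method_1 = true_chance(p, 18, 12, 1.6449, 0.5)   # with the continuity correction
    method_2 = true_chance(p, 18, 12, 1.6449, 0.0)   # correction omitted
    print(p, round(method_1, 4), round(method_2, 4))
```

Since omitting the correction can only add points to the region cut off, the Method 2 chance is never less than the Method 1 chance for the same p.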

42. It is clear that no final conclusions can be based on two numerical examples, but it 
seems that the test of the null hypothesis in Problem II should be carried out as follows: 

(a) When m, n, r or s are small, with the help of tables prepared on Barnard’s lines, based
on an ordered classification of the points in the lattice diagram, and giving the true upper
bound of the chance that a point (a, b) falls on or beyond the level on which the observed
result lies. The particular basis of his classification may, of course, be modified.

(b) When m, n, r and s are large, by assuming that the u of equation (22) is a normal
deviate with unit standard deviation.


(vii) The classical approach to Problem II 

43. It has recently become customary to regard the test of significance applied to data
given in a 2 × 2 table as the limiting case of a $\chi^2$ test with one degree of freedom. But Problem
II was originally answered in somewhat different terms. It was noted that if

$$p_1(A) = p_2(A) = p, \tag{23}$$

then the fractions a/m and b/n would both have expectations of p and variances of p(1 − p)/m
and p(1 − p)/n, respectively. Hence, if the null hypothesis were true, the difference

$$d = \frac{a}{m} - \frac{b}{n} \tag{24}$$

would have

$$\text{mean } d = 0, \qquad \sigma_d^2 = p(1-p)\left(\frac{1}{m} + \frac{1}{n}\right). \tag{25}$$

In large samples, therefore, it might be expected that

$$\frac{d}{\sigma_d} = \frac{a/m - b/n}{\sqrt{p(1-p)(1/m + 1/n)}} \tag{26}$$

would be approximately normally distributed. Since by the nature of the problem the
common value of p was unknown, an estimate was made from the sample, namely,

$$\frac{a + b}{m + n} = \frac{r}{N}. \tag{27}$$

Substituting this into equation (26), we have

$$\frac{d}{s_d} = \frac{a/m - b/n}{\sqrt{(r/N)(1 - r/N)(1/m + 1/n)}} \tag{28.1}$$

$$= \frac{a - rm/N}{\sqrt{mnrs/N^3}}. \tag{28.2}$$

44. The form (28.2) is easily derived from (28.1), if we remember that b = r − a, s = N − r
and m + n = N.* It is seen that the ratio $d/s_d$ is identical with the ratio u of equation (22),
except for a factor $\sqrt{(N-1)/N}$ which is unimportant in large samples. Thus the classical
test is practically identical with that suggested in paras. 40-42 above, though the two tests
are differently derived.


* A third alternative form is, of course, $(ad - bc)\sqrt{N}/\sqrt{mnrs}$.
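The agreement of the classical ratio with the ratio u, apart from the factor depending on N alone, may be verified numerically. The Python sketch below is illustrative; the counts are hypothetical and the function names are assumptions. It computes the classical ratio with r/N substituted for p, the algebraically equivalent form $(ad - bc)\sqrt{N}/\sqrt{mnrs}$, and u multiplied by $\sqrt{N/(N-1)}$.

```python
from math import sqrt

def classical_ratio(a, b, m, n):
    """Difference of proportions over its estimated s.d., with r/N in place of p."""
    N, r = m + n, a + b
    p_hat = r / N
    return (a / m - b / n) / sqrt(p_hat * (1 - p_hat) * (1 / m + 1 / n))

def u_ratio(a, b, m, n):
    """Deviation of a from rm/N over the hypergeometric standard deviation."""
    N, r = m + n, a + b
    s = N - r
    return (a - r * m / N) / sqrt(m * n * r * s / (N**2 * (N - 1)))

a, b, m, n = 9, 2, 18, 12      # hypothetical counts chosen only for illustration
N = m + n
c, d = m - a, n - b
r, s = a + b, c + d
third_form = (a * d - b * c) * sqrt(N) / sqrt(m * n * r * s)
rescaled_u = u_ratio(a, b, m, n) * sqrt(N / (N - 1))

print(classical_ratio(a, b, m, n), third_form, rescaled_u)
```

All three printed values coincide to rounding error, so the two tests differ only through the factor involving N.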





(viii) Problem III 

45. This may be described as the test for the independence of two characters A and B.
It is supposed that the probability that an individual selected at random will possess
character A is p(A) and that he will not possess it is $p(\bar{A}) = 1 - p(A)$. The corresponding
probabilities for character B are p(B) and $p(\bar{B}) = 1 - p(B)$. Four alternative combinations
of the characters may occur, which may be denoted by $AB$, $\bar{A}B$, $A\bar{B}$ and $\bar{A}\bar{B}$. The various
probabilities are set out in Table 6A. If the null hypothesis, $H_0$, specifying the independence
of A and B is true, then

$$p(AB) = p(A) \times p(B), \quad p(\bar{A}B) = p(\bar{A})\,p(B), \quad \text{etc.} \tag{29}$$

To test the hypothesis, we have a random sample of N observations with frequencies of
occurrence of the combinations $AB$, $\bar{A}B$, etc., which may be classified in the 2 × 2 scheme of
Table 6B. The sampling conditions are such that the probabilities of Table 6A are the same
for all individuals selected, or, in conventional terms, the sample is drawn from an infinite
population. Barnard calls this problem that of the double dichotomy.


Table 6A. Probabilities

              A                    $\bar{A}$                 Total

  B           $p(AB)$              $p(\bar{A}B)$             $p(B)$
  $\bar{B}$   $p(A\bar{B})$        $p(\bar{A}\bar{B})$       $p(\bar{B})$

  Total       $p(A)$               $p(\bar{A})$              1


Table 6B. Sample data

              A          $\bar{A}$      Total

  B           a          c              m
  $\bar{B}$   b          d              n

  Total       r          s              N


46. In Problem III there is only one application of a random process, the selection of N
individuals, each one of which must fall into one or other of four alternative categories. If
the random process were repeated and another sample of N drawn, not only are the
frequencies a, b, c and d free to vary, but also both marginal totals, i.e. m may change as well
as r. The experimental probability set will therefore contain results (a, b, c, d) restricted by
the conditions (i) that none of the frequencies can be negative and (ii) that

$$a + b + c + d = N. \tag{30}$$

Geometrically, as Barnard points out, the set can be represented in 3 dimensions by points
at unit intervals within a tetrahedron obtained by placing on top of one another the series
of 2-dimensioned lattices of dimensions

$$0 \times n, \quad 1 \times (n - 1), \quad 2 \times (n - 2), \quad \ldots, \quad (m - 1) \times 1, \quad m \times 0. \tag{31}$$

47. We are again testing a composite hypothesis and should like to determine a family of
critical surfaces to be used as significance levels, dividing the points within the tetrahedron
in such a way that the chance of the sample point (a, b, c, d)* lying outside a given surface
$L_\epsilon$ is equal to $\epsilon$, whatever the values of the unknown probabilities p(A) and p(B). But
again, as in Problem II, owing to the discontinuity in the set of points, there are no ‘similar

* In view of the condition (30), the point can be defined by three co-ordinates, e.g. as (a, b, c),
(a, b, m) or (a, r, m). In view of the form of equation (32), the last system of co-ordinates will be used.





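regions’. We note that if $H_0$ is true, the probability of the observed result is a term of the
multinomial expansion, viz.

$$\frac{N!}{a!\,b!\,c!\,d!}\; p(AB)^a\, p(A\bar{B})^b\, p(\bar{A}B)^c\, p(\bar{A}\bar{B})^d$$

$$= \frac{N!}{a!\,b!\,c!\,d!}\; p(A)^{a+b}\, p(B)^{a+c}\, p(\bar{A})^{c+d}\, p(\bar{B})^{b+d}$$

$$= \frac{N!}{m!\,n!}\, p(B)^m (1 - p(B))^n \times \frac{N!}{r!\,s!}\, p(A)^r (1 - p(A))^s \times \frac{m!\,n!\,r!\,s!}{a!\,b!\,c!\,d!\,N!}$$

$$= P_2\{m \mid p(B), N\} \times P_2\{r \mid p(A), N\} \times P_1\{a \mid N, r, m\}. \qquad (32)$$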

Here, the notation of para. 30 has been repeated. 
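The factorization in equation (32) is easy to confirm numerically. The following sketch assumes independence of A and B, so that each cell probability is a product of marginal probabilities; the function names and the values 0.6 and 0.55 chosen for p(A) and p(B) are hypothetical, picked only for illustration.

```python
from math import comb, factorial, isclose


def multinomial_prob(a, b, c, d, pA, pB):
    """Multinomial term for the table (a, b, c, d) when A and B are independent."""
    N = a + b + c + d
    coef = factorial(N) // (factorial(a) * factorial(b) * factorial(c) * factorial(d))
    qA, qB = 1 - pA, 1 - pB
    # cells: a -> (A, B), b -> (A, not-B), c -> (not-A, B), d -> (not-A, not-B)
    return coef * (pA * pB) ** a * (pA * qB) ** b * (qA * pB) ** c * (qA * qB) ** d


def binomial_term(k, N, p):
    """Probability of k successes in N independent trials."""
    return comb(N, k) * p ** k * (1 - p) ** (N - k)


def hypergeometric_term(a, N, r, m):
    """m! n! r! s! / (a! b! c! d! N!), written with binomial coefficients."""
    return comb(m, a) * comb(N - m, r - a) / comb(N, r)


a, b, c, d = 15, 5, 3, 7           # frequencies of Table 7
pA, pB = 0.6, 0.55                 # arbitrary illustrative values
N, r, m = a + b + c + d, a + b, a + c

lhs = multinomial_prob(a, b, c, d, pA, pB)
rhs = binomial_term(m, N, pB) * binomial_term(r, N, pA) * hypergeometric_term(a, N, r, m)
print(lhs, rhs)
```

Only the two binomial factors depend on p(A) and p(B); the third factor is the same whatever values are chosen, which is the point taken up in para. 48.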

48. Thus the probability of obtaining a sample represented by the triplet (a, r, m) may be 
regarded, if the characters A and B are independent, as the product of three terms: 

(i) The probability of drawing m individuals with character B in a random sample of N,
i.e. the probability that (a, r, m) falls in a horizontal section of the tetrahedron on which
m = constant. This is the (m + 1)th term in the expansion of the binomial

$$\{(1 - p(B)) + p(B)\}^N.$$

(ii) The probability of drawing r individuals with character A in a random sample of N,
i.e. the probability that (a, r, m) falls on the vertical section of the tetrahedron on which
r = constant. This is the (r + 1)th term in the expansion of

$$\{(1 - p(A)) + p(A)\}^N.$$

(iii) The probability, given m and r, of the observed partition within the 2 × 2 table. This
term represents the relative probability associated with the points lying along a straight line
m = constant, r = constant; it is, of course, the same expression as has arisen in Problems
I and II and is proportional to a term in the hypergeometric series $F(-r, -m, n - r + 1, 1)$.

49. We are faced with a situation similar to that met under Problem II. Were it possible
to cut off from each line on which m = constant, r = constant, a group of points such that

$$\sum [P_1\{a \mid N, r, m\}] = \epsilon, \qquad (33)$$

then the subset of points within the tetrahedron composed of the sum of these groups for
all possible combinations of m and r would have the property required of a ‘critical region’
in a significance test: i.e. the chance that the point (a, r, m) is included in the region, if $H_0$
is true, would be $\epsilon$, whatever values the irrelevant probabilities p(A) and p(B) assumed.
However, (33) cannot be satisfied in general, and all that is possible is to define a family of
significance contours such that the chance of a sample point falling beyond any one of them,
say $L_\epsilon$, is $\le \epsilon$. By using the normal approximation to the sum of the hypergeometric
tail-terms with the correction for continuity as described in paras. 35-39 for Problem II, we shall
be very much on the safe side, i.e. the formal level of $\epsilon$ is likely to be much above the true
chance of falling beyond the level, whatever be p(A) or p(B). The presence of the two binomial
terms in equation (32) instead of the single term in equation (8-3), makes it likely that the
overestimation of $\epsilon$ will be greater in Problem III than in II. It is to be expected, therefore,
that at any rate when none of m, n, r or s is too small, the better approximation will be
obtained by referring the u of equation (22) to the normal probability scale.

50. The handling of Problem III is discussed briefly by Barnard on p. 136 above. 
There is clearly room for further investigation. The general nature of the approximation 




involved is of course that which arises in every test for goodness of fit or for independence 
in an h x k table, where we replace a distribution consisting of a finite set of probabilities at 
discrete points in multiple space by a continuous distribution for which integration outside 
ellipsoidal contours is straightforward. 

(ix) General comment 

51. The duties of the statistician lie at many levels. He may be required merely to apply 
an established technique of analysis to an assembly of numerical data and this application 
may result in a statement, based on probability theory, of a ‘level of significance’ or a 
‘confidence interval’, which will be used by others. Or he may be called on to share in 
planning the investigation or experiment which is to provide the data and then to draw 
conclusions from their analysis which will lead to further action. In this final role he needs 
to bring into play faculties which are no monopoly of his calling, the qualities of sound 
judgement which are the characteristics of a well trained, scientific mind. In the weighing 
of evidence, the result of the statistical analysis, expressed in one or more conventional 
probability figures, is only one factor in the summing up; as important, may be, is the 
question of whether the mathematical model is a fair counterpart to the happenings in the 
observational field. In addition, there will often be much information coming from outside 
the range of the immediate investigation, yet hardly expressible in numerical terms, which 
must influence decision. 

52. It is perhaps hard experience gained in certain fields of war-time research, where 
decisions had to be reached on statistical data far less ample than could be wished, which has 
forced my own attention to this question: What weight do we actually give to the precise
value of a probability measure when reaching decisions of first importance ? One subject for 
examination falling under this inquiry is clearly the logical basis of the reasoning process 
by which judgement is influenced as a result of the application of a test of significance. This 
was the theme on which this paper opened. The approach illustrated in the pages which 
followed is a personal one and is set down, with no claim to be the best, in order to provoke 
thought and discussion. There appears no short route to a right answer in this matter; each 
individual who hopes to use his own judgement to the full in drawing conclusions from the 
statistical analysis of sampling data, must decide for himself what he requires of probability 
theory. 

53. In the approach which I have followed and illustrated on the analysis of data classed 
in a 2 x 2 table, the appropriate probability set-up is defined by the nature of the random 
process actually used in the collection of the data. Consideration of this point forms the 
initial step in the determination of the appropriate test. On this score, what I have termed 
Problems I, II and III are differentiated. The difference is fundamental and lies at the 
bottom of the dilemma to which the Barnard-Fisher correspondence in Nature drew attention.
It can be illustrated on the following data, given in Table 7, where I shall suppose that
the effect we are interested in is that making a significantly greater than b.

54. If (a) the results have been obtained by random assignment of Treatment 1 to 
eighteen out of thirty individuals and Treatment 2 to the remaining twelve, and 

(b) we merely ask whether the results are consistent with the hypothesis that the treat- 
ments are equivalent as far as these thirty individuals are concerned, so that the difference 
between the proportions 15/18 and 5/12 may reasonably be ascribed to a chance fluctuation, 





(c) we are then concerned with Problem I, i.e. simply with the probabilities associated
with the points (a, 20 - a) on the diagonal r = 20 of Fig. 4. The chance of getting a ≥ 15, if
the null hypothesis is true, is 0.0241,* or, using a common phrase, we can speak of the result
being significant at the 2.5% level.
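The Problem I figures for Table 7 can be reproduced in a few lines. This is a sketch under stated assumptions: the function names are hypothetical, the exact chance is taken to be the upper tail of the hypergeometric series with N = 30, r = 20, m = 18, and the normal approximation uses the half-unit continuity correction with variance mnrs/N²(N − 1), which is the value implied by the divisor 1.2865 quoted in the footnote.

```python
from math import comb, erfc, sqrt


def upper_tail_exact(a_obs, N, r, m):
    """Chance of a >= a_obs when both margins (r and m) are held fixed."""
    n = N - m
    top = min(r, m)
    return sum(comb(m, a) * comb(n, r - a) for a in range(a_obs, top + 1)) / comb(N, r)


def upper_tail_normal(a_obs, N, r, m):
    """Normal approximation to the same tail, with the correction for continuity."""
    n, s = N - m, N - r
    mean = r * m / N
    sd = sqrt(m * n * r * s / (N * N * (N - 1)))
    u = (a_obs - 0.5 - mean) / sd
    return u, 0.5 * erfc(u / sqrt(2))  # area of the normal curve beyond u


p_exact = upper_tail_exact(15, N=30, r=20, m=18)
u, p_normal = upper_tail_normal(15, N=30, r=20, m=18)
print(f"exact tail {p_exact:.4f}; u = {u:.3f}; normal tail {p_normal:.3f}")
```

The exact sum and the normal area both fall just on the significant side of the 2.5% level, as the text states.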

55. On the other hand, if a sample of 18 has been drawn randomly from one population
and a sample of 12 independently from a second and we wish to test whether $p_1(A) = p_2(A)$,
then it seems to be an artificial procedure to restrict the experimental probability set to
the 11 points on the line r = 20, i.e. to the values of a: 8, 9, ..., 18. A repetition of the double
sampling process could give us a result (a, b) falling at any of the 19 × 13 = 247 points in
the lattice diagram of Fig. 4. There will be a number of ways of defining a family of significance
levels for this 2-dimensioned set; if we adopt that discussed in paras. 40-41, which
gives as two of its members the dotted, stepped lines shown in Fig. 4, we can say that the
chance of a result falling beyond the lower line is certainly less than 0.015.† The observed
point, with a = 15, b = 5 falls beyond the line, so that the result is undoubtedly ‘significant
at the 1.5% level’.


Table 7

                                                  Frequency of results
For Problem I     For Problem II                  A         $\bar{A}$    Total
1st treatment     Sample from 1st population      a = 15    c = 3        m = 18
2nd treatment     Sample from 2nd population      b = 5     d = 7        n = 12
Total                                             r = 20    s = 10       N = 30


56. These two probabilities, 2.5 and 1.5%, are not the same, but there is no inconsistency
in their difference. The character of the two investigations is different and to treat Problem II
as though it were Problem I seems to call for a probability set-up which is unnecessarily
artificial, when a simpler one is available. Admittedly by getting what seems to me a closer
relation between the probability set-up and the experimental procedure, we have sacrificed
some simplicity in handling the 2 × 2 table. But this is only the case when dealing with
small numbers. For large numbers the methods of handling Problems I, II and III become,
practically, identical.

57. Consider again the heavy shell problem described in para. 7 above. If we are to intro- 
duce probability theory, it seems to me that we should regard the problem as one in which 
we have a sample of m = 12 from the possible output of shell made to one design or by one 
firm and of n = 8 from the possible output of a second. This sampling may be hypothetical 
in that these may be ‘pilot’ shell, the first off production; nevertheless, this construct is

* For the normal curve approximation, using the correction for continuity, we find

$$u = (15 - \tfrac{1}{2} - 12.0)/1.2865 = 1.943.$$

The proportionate area under the normal curve beyond this deviation is 0.026.

† Table 5, col. (5) shows the largest value of this chance to be 0.0120 for p = 0.3. This figure cannot
be much exceeded for other p’s though I have not determined the precise maximum. I give 0.015 as
a safe-side limit.


clearly less artificial than one in which, on the null hypothesis, we regard the experiment as 
though it were made on twenty shells, to twelve of which has been randomly assigned the 
label ‘Made by firm X’ and to the other eight, ‘Made by firm Y’.

58. It is clear that in the heavy shell problem there may be many reasons to doubt
whether the rounds fired can be regarded as a random sample from future output. That is
why I have emphasized that the exploration which the statistician makes in private will not
necessarily be presented in figures at the conference table. In this example, the proportions
of successful perforations were 2/12 and 5/8; these put us on the line, r = 7, of the lattice
diagram for which the hypergeometric probabilities were shown in Fig. 2. The sum of the
terms with a ≤ 2 is 5.2% (normal approximation, using the ½-correction, 5.6%). This is
the chance of getting as great or a greater positive difference, b - a, if $H_0$ were true, treating
the case as Problem I. Barnard’s method has not yet been extended to cover this case,
but if we were to use the large sample method for handling Problem II, described in my
paras. 40-41, we should find from equation (22) that

$$u = (2 - 4.2)/1.072 = -2.05,$$

which puts (a, b) outside the upper 2.5% level.
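The three figures quoted for the heavy shell data (a = 2, m = 12, n = 8, r = 7) can be recomputed as follows. The sketch rests on an assumption: equation (22) is not reproduced in this part of the paper, so the form of u used here, deviation from rm/N divided by the square root of mnrs/N²(N − 1), is inferred from the quoted divisor 1.072 and from Barnard's remark that his expression agrees with (22). Function names are hypothetical.

```python
from math import comb, erfc, sqrt


def lower_tail_exact(a_obs, N, r, m):
    """Sum of the hypergeometric terms with a <= a_obs (margins fixed)."""
    n = N - m
    low = max(0, r - n)
    return sum(comb(m, a) * comb(n, r - a) for a in range(low, a_obs + 1)) / comb(N, r)


def sd_fixed_margins(N, r, m):
    """Standard deviation of a: sqrt(m n r s / (N^2 (N - 1)))."""
    n, s = N - m, N - r
    return sqrt(m * n * r * s / (N * N * (N - 1)))


def u_statistic(a, N, r, m):
    """Assumed form of equation (22): no continuity correction."""
    return (a - r * m / N) / sd_fixed_margins(N, r, m)


def lower_tail_normal(a_obs, N, r, m):
    """Normal approximation to the lower tail with the half-unit correction."""
    u_c = (a_obs + 0.5 - r * m / N) / sd_fixed_margins(N, r, m)
    return 0.5 * erfc(-u_c / sqrt(2))


N, r, m, a = 20, 7, 12, 2
p_exact = lower_tail_exact(a, N, r, m)
p_normal = lower_tail_normal(a, N, r, m)
u = u_statistic(a, N, r, m)
print(f"exact {p_exact:.3f}, corrected normal {p_normal:.3f}, u = {u:.2f}")
```

Treated as Problem I the result sits just above the 5% point, whereas the uncorrected u lies beyond the 2.5% point; this is the contrast on which para. 59 comments.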

59. Were the action taken to be decided automatically by the side of the 5 % level on 
which the observation point fell, it is clear that the method of analysis used would here be 
of vital importance. But no responsible statistician, faced with an investigation of this 
character, would follow an automatic probability rule. The result of either approach would 
raise considerable doubts as to whether the performance of the first type of shell was as good 
as that of the second, but without the whole background of the investigation it is impossible 
to say what the statistician’s recommendation as to further action would be. 

60. In the example of the proof of anti-tank shot discussed in para. 6, the chance of 
perforation, p , while varying from plate to plate and batch to batch, will almost certainly 
not range through the whole interval 0-1. The striking-velocity of the shot would also 
probably be adjusted so that for average proof-plate and batches, p was near ½. Then the
discriminating level (or levels*) set across the 13 × 13 lattice diagram would be fixed paying
regard to the likely variation in p; thus a fairly close upper limit could be calculated to the
true probability of (a, b) falling beyond the level if the fresh batch were of the same quality 
as the standard. This is the upper limit of the risk of segregating the batch wrongly. 

61. Precisely similar problems arise for consideration in even more difficult form in the
analysis of data arranged in an h × k table, where h or k or both are > 2. It has become common
practice to speak of the solution of this problem in terms of ‘fixed marginal totals’, but it
may be questioned whether the restriction in the experimental probability set implied is
generally appropriate. The frequencies in an h × k table may have been obtained by many
different sampling procedures for, as in the 2 × 2 problem, a single form of tabular presentation
will follow from a variety of types of investigation. For most of these, a repetition of the random
process of selection would give results with either one or both sets of marginal totals changed.

62. For convenience in solution we may, of course, start by considering the distribution 
of our test criterion, on the null hypothesis, within the sub-set of results for which the margins 

* It is possible that two levels might be taken with the associated proof rules: (i) if (a, b) falls beyond 
the outer one, reject the batch ; (ii) if between outer and inner, fire further rounds ; (iii) if within the 
inner level, accept the batch. 





are fixed. If this distribution were the same whatever these fixed values, then the overall
distribution for unrestricted sampling would be the same as that for variation subject to
fixed margins. Thus, mathematically, the solution of the partial problem would be a step in
the solution of the complete one. But when applying $\chi^2$ analysis to an h × k table, this result
is only true as a large-sample approximation.

63. If we use the mathematical model which it is suggested gives the most direct aid in
reasoning from the observations, i.e. that which regards the experimental probability set
as generated by a repetition of the random process of selection used in collecting the data,
then in the majority of cases we cannot regard the marginal totals as fixed. Thus a rigorous
treatment would lead, as in the case of the 2 × 2 table, to a differentiation into a number of
solutions. It is to be hoped, however,* unless the numbers in the margins are very small,
that the $\chi^2$ approximation with its appropriate degrees of freedom† will give results which
are not misleading. This approximation leads, of course, in the 2 × 2 table to the reference
of the ratio u of equation (22) to the normal probability scale. Some aspects of the approximation
in this more general case were discussed by Yates (1934, pp. 233-35).

64. In closing I should like again to acknowledge my indebtedness to Mr G. A. Barnard.
Having had the good fortune to discuss these problems with him and see drafts of his work
over a period of 2 or 3 years it is difficult to say how many of his ideas have been built
unconsciously into my own earlier approach. But I am especially aware of the clarification
which his emphasis on the distinction between Problems I, II and III brought to my survey.
I am also very grateful to Mr M. G. Kendall, Dr R. C. Geary and Dr B. L. Welch for a number
of helpful criticisms, and to Mrs Maxine Merrington for her extensive computing work, which
has alone made possible the various numerical illustrations that I have given.

* From the point of view both of the exponents of the fixed marginal and unrestricted marginal 
approach. 

† The statement that, for example, in applying the test of independence of two characters to an
h × k table, the degrees of freedom are (h − 1) × (k − 1), does not of course mean that sampling is
restricted by fixed marginal totals. All that is implied is that approximately the overall distribution of
the $\chi^2$ function of the observations used, is the same as that for sampling within the restricted sub-set;
this is because the distribution within each sub-set is approximately independent of the particular
marginal totals which define it.


REFERENCES 

Barnard, G. A. (1945a). Nature, Lond., 156, 177.
Barnard, G. A. (1945b). Nature, Lond., 156, 783.
Fisher, R. A. (1935). J. Roy. Statist. Soc. 98, 39.
Fisher, R. A. (1941). Science, 94, 210.
Fisher, R. A. (1945a). Nature, Lond., 156, 388.
Fisher, R. A. (1945b). Sankhyā, 7, 130.
Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co. Ltd.
Neyman, J. & Pearson, E. S. (1928a). Biometrika, 20A, 195.
Neyman, J. & Pearson, E. S. (1928b). Biometrika, 20A, 263.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 113.
Neyman, J. & Pearson, E. S. (1938). Statist. Res. Mem. 2, 25.
Pearson, K. (1899). Phil. Mag. 47, 236.
Przyborowski, J. & Wilenski, H. (1939). Biometrika, 31, 313.
Wilson, E. B. (1941). Science, 93, 557.
Wilson, E. B. (1942). Proc. Nat. Acad. Sci., Wash., 28, 94.
Yates, F. (1934). J. Roy. Statist. Soc. Suppl. 1, 217.





APPENDIX 


THE NORMAL CURVE APPROXIMATION IN PROBLEM I 


1. The following Tables 8 and 9 (A), (B) and (C) show the order of accuracy which results
from using the normal curve integral as an approximation to the tail sums in the series

$$P_1\{a \mid N, r, m\} = \frac{m!\,n!\,r!\,s!}{a!\,b!\,c!\,d!\,N!}, \qquad (34)$$

the terms of which are proportional to those in the hypergeometric series

$$F(-r, -m, N - m - r + 1, 1).$$

Here a is a variable which can assume the range of positive, integral values indicated
under (i), (ii) and (iii) in para. 20 above, while N, r and m are fixed. The relation
between these quantities and b, c, d, n and s is given in Table 1, para. 17. The method of
approximation, using the correction for continuity, has been discussed in para. 25.
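The comparison tabulated in this Appendix can be regenerated for any single cell. The sketch below is illustrative only: the function names are hypothetical, the "true" value is taken to be the upper tail sum of series (34), and the normal value is assumed to use the half-unit continuity correction with variance mnrs/N²(N − 1). It is run on the partition m = n = 10 with r = 10.

```python
from math import comb, erfc, sqrt


def true_upper_tail(N, r, m, a1):
    """Sum of the terms of series (34) for which a >= a1."""
    n = N - m
    top = min(r, m)
    return sum(comb(m, a) * comb(n, r - a) for a in range(a1, top + 1)) / comb(N, r)


def normal_upper_tail(N, r, m, a1):
    """Normal curve approximation to the same sum, with continuity correction."""
    n, s = N - m, N - r
    mean = r * m / N
    sd = sqrt(m * n * r * s / (N * N * (N - 1)))
    u = (a1 - 0.5 - mean) / sd
    return 0.5 * erfc(u / sqrt(2))


# One column of the equal-partition case: m = n = 10, r = 10.
rows = [
    (a1, true_upper_tail(20, 10, 10, a1), normal_upper_tail(20, 10, 10, a1))
    for a1 in (7, 8, 9)
]
for a1, true_sum, approx in rows:
    print(f"a1 = {a1}: true {true_sum:.4f}, normal approx. {approx:.4f}")
```

In this column the normal value exceeds the true sum at every level, and the relative excess grows as the tail becomes more extreme.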


2. Table 8 takes the case of an equal partition, $m = n = \tfrac{1}{2}N$, and shows the sum of the
terms in the expression (34) for which $a \ge a_1$, which is also the sum of terms for which
$a \le r - a_1$. For $m \ne n$, results are given in Table 9 for m > n and for the following
proportionate partitions of N:

(A) $m = \tfrac{3}{5}N$, (B) $m = \tfrac{4}{5}N$, (C) $m = \tfrac{9}{10}N$.

Here sums of terms at both tails of the series are needed. The sums (or chances of $a \le a_1$
or $a \ge a_1$) have not been given for all possible values of $a_1$ but, broadly speaking, for those
within the limits where significance is likely to be in question. Sums below 0.0010 have
generally been omitted. In each case the true sum of the terms (34) is compared with the
approximation from the normal integral.


3. In drawing conclusions from the comparison, we have to decide what degree of
accuracy is called for. Clearly the normal integral does not give mathematically exact
results to 4 decimal places. On the other hand, except for certain instances where the
partition is very unequal ($m = \tfrac{4}{5}N$ and $\tfrac{9}{10}N$) and r is small, the order of the approximation
may be said to follow that of the series closely. If decisions are made by rule of thumb,
according to the side of the 5% or 1% significance level on which a falls, then there are
a number of entries in the tables where the approximation would give a on the wrong side.
But one may question whether judgement of significance based on a single experiment can
in fact be made sensitive to a difference between, say, 0.06 and 0.04 (odds of 16 to 1 and
24 to 1) or between 0.012 and 0.008 (odds of 82 to 1 and 124 to 1) and, given such latitude
in accuracy, the approximation will be found generally sufficient.* These must be points,
however, where personal opinions will differ. Whatever views are held, the tables are
sufficiently extensive to make it possible to obtain from them a rough measure of the
accuracy of approximation in a wide range of cases.
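The odds quoted in para. 3 follow from the usual conversion of a probability p into odds of (1 − p)/p to 1 against. A minimal sketch (the function name is hypothetical):

```python
def odds_against(p: float) -> float:
    """Odds against an event of probability p, expressed as 'x to 1'."""
    return (1 - p) / p


levels = [0.06, 0.04, 0.012, 0.008]
rounded = [round(odds_against(p)) for p in levels]
for p, x in zip(levels, rounded):
    print(f"p = {p}: about {x} to 1")
```

Rounded to whole numbers these give the four figures cited in the text.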


4. It will be noted that in the symmetrical case ($m = \tfrac{1}{2}N$) and also when $m = \tfrac{3}{5}N$ the
normal approximation for the tail sum is almost invariably a little too large. Undoubtedly
for the symmetrical case an improved approximation could be obtained by modifying
the ½ correction used in calculating the ratio of deviation to standard deviation. This
second order term would, however, need to vary with the probability level, thus complicating
the procedure.



Table 8. Case of equal partition, m = n = ½N. Chance that a ≥ a_1 = chance that a ≤ r − a_1

(For each partition the first column is the true sum and the second the normal approximation; - marks no entry.)

             m = n = 50       m = n = 30       m = n = 20       m = n = 15       m = n = 10
  r   a_1   True    Normal   True    Normal   True    Normal   True    Normal   True    Normal

 30   17   0.2566  0.2574   0.2194  0.2212      -       -        -       -        -       -
      18   0.1376  0.1388   0.0981  0.1002      -       -        -       -        -       -
      19   0.0630  0.0643   0.0348  0.0365      -       -        -       -        -       -
      20   0.0243  0.0253   0.0096  0.0106      -       -        -       -        -       -
      21   0.0078  0.0085   0.0020  0.0024      -       -        -       -        -       -
      22   0.0021  0.0024      -       -        -       -        -       -        -       -

 20   12   0.2269  0.2278   0.2060  0.2076   0.1715  0.1745      -       -        -       -
      13   0.1063  0.1068   0.0862  0.0873   0.0564  0.0592      -       -        -       -
      14   0.0392  0.0408   0.0270  0.0287   0.0128  0.0144      -       -        -       -
      15   0.0114  0.0126   0.0064  0.0073   0.0019  0.0025      -       -        -       -
      16   0.0025  0.0031   0.0011  0.0014      -       -        -       -        -       -

 15    9   0.2884  0.2887   0.2760  0.2772   0.2572  0.2595   0.2330  0.2364      -       -
      10   0.1312  0.1325   0.1163  0.1185   0.0954  0.0985   0.0715  0.0755      -       -
      11   0.0453  0.0473   0.0358  0.0380   0.0242  0.0265   0.0134  0.0156      -       -
      12   0.0113  0.0129   0.0077  0.0090   0.0040  0.0049   0.0014  0.0020      -       -
      13   0.0019  0.0027   0.0011  0.0016      -       -        -       -        -       -

 10    7   0.1589  0.1599   0.1495  0.1514   0.1367  0.1397   0.1226  0.1266   0.0894  0.0955
       8   0.0458  0.0486   0.0399  0.0429   0.0324  0.0357   0.0251  0.0285   0.0115  0.0147
       9   0.0078  0.0101   0.0061  0.0081   0.0042  0.0058   0.0026  0.0038   0.0005  0.0011
      10   0.0006  0.0014   0.0004  0.0010      -       -        -       -        -       -

  7    5   0.2179  0.2177   0.2119  0.2126   0.2038  0.2056   0.1950  0.1980   0.1749  0.1804
       6   0.0558  0.0594   0.0514  0.0553   0.0458  0.0501   0.0401  0.0448   0.0286  0.0338
       7   0.0062  0.0096   0.0053  0.0084   0.0042  0.0068   0.0032  0.0055   0.0015  0.0031

  5    4   0.1810  0.1806   0.1766  0.1771   0.1709  0.1735   0.1648  0.1677   0.1517  0.1571
       5   0.0281  0.0339   0.0261  0.0320   0.0236  0.0295   0.0211  0.0270   0.0163  0.0220



Table 9. Case of unequal partition. Chances that a ≤ a_1 and a ≥ a_1

(A) m = ⅗N, n = ⅖N

(For each partition the first column is the true sum and the second the normal approximation; - marks no entry.)

                       m = 60, n = 40   m = 36, n = 24   m = 24, n = 16   m = 18, n = 12   m = 12, n = 8
  r   Chance    a_1   True    Normal   True    Normal   True    Normal   True    Normal   True    Normal
      that

 30   a ≤ a_1   11   0.0020  0.0019      -       -        -       -        -       -        -       -
                12   0.0074  0.0074   0.0016  0.0020      -       -        -       -        -       -
                13   0.0230  0.0231   0.0084  0.0093      -       -        -       -        -       -
                14   0.0601  0.0604   0.0320  0.0337   0.0023  0.0050      -       -        -       -
                15   0.1330  0.1339   0.0936  0.0957   0.0270  0.0329      -       -        -       -
                16   0.2512  0.2531   0.2148  0.2165   0.1311  0.1348      -       -        -       -
      a ≥ a_1   20   0.2533  0.2531   0.2148  0.2165   0.1322  0.1348      -       -        -       -
                21   0.1323  0.1339   0.0936  0.0957   0.0318  0.0329      -       -        -       -
                22   0.0580  0.0604   0.0320  0.0337   0.0045  0.0050      -       -        -       -
                23   0.0200  0.0231   0.0084  0.0093      -       -        -       -        -       -
                24   0.0061  0.0074   0.0016  0.0020      -       -        -       -        -       -
                25   0.0014  0.0019      -       -        -       -        -       -        -       -

 20   a ≤ a_1    6   0.0027  0.0026   0.0010  0.0012      -       -        -       -        -       -
                 7   0.0114  0.0112   0.0060  0.0063   0.0015  0.0021      -       -        -       -
                 8   0.0381  0.0378   0.0255  0.0262   0.0112  0.0128   0.0015  0.0033      -       -
                 9   0.1019  0.1021   0.0816  0.0829   0.0526  0.0555   0.0209  0.0260      -       -
                10   0.2211  0.2232   0.2005  0.2028   0.1665  0.1695   0.1170  0.1218      -       -
      a ≥ a_1   14   0.2236  0.2232   0.2017  0.2028   0.1665  0.1695   0.1182  0.1218      -       -
                15   0.0994  0.1021   0.0798  0.0829   0.0526  0.0555   0.0241  0.0260      -       -
                16   0.0341  0.0378   0.0233  0.0262   0.0112  0.0128   0.0026  0.0033      -       -
                17   0.0080  0.0112   0.0048  0.0063   0.0015  0.0021      -       -        -       -
                18   0.0016  0.0026   0.0006  0.0012      -       -        -       -        -       -

 15   a ≤ a_1    4   0.0053  0.0053   0.0032  0.0033   0.0013  0.0015      -       -        -       -
                 5   0.0236  0.0233   0.0171  0.0173   0.0098  0.0106   0.0038  0.0052      -       -
                 6   0.0776  0.0775   0.0650  0.0657   0.0481  0.0499   0.0301  0.0335      -       -
                 7   0.1948  0.1968   0.1804  0.1827   0.1588  0.1618   0.1317  0.1358   0.0511  0.0616
      a ≥ a_1   11   0.1970  0.1968   0.1814  0.1827   0.1587  0.1618   0.1317  0.1358   0.0578  0.0616
                12   0.0734  0.0775   0.0614  0.0657   0.0458  0.0499   0.0301  0.0335   0.0036  0.0051
                13   0.0188  0.0233   0.0138  0.0173   0.0082  0.0106   0.0038  0.0052      -       -
                14   0.0029  0.0053   0.0018  0.0033   0.0008  0.0015      -       -        -       -

 10   a ≤ a_1    2   0.0088  0.0089   0.0067  0.0071   0.0045  0.0050   0.0026  0.0033   0.0004  0.0009
                 3   0.0457  0.0453   0.0395  0.0398   0.0318  0.0329   0.0241  0.0260   0.0099  0.0131
                 4   0.1538  0.1549   0.1447  0.1464   0.1322  0.1348   0.1182  0.1218   0.0849  0.0910
      a ≥ a_1    8   0.1539  0.1549   0.1442  0.1464   0.1311  0.1348   0.1170  0.1218   0.0849  0.0910
                 9   0.0386  0.0453   0.0334  0.0398   0.0270  0.0329   0.0209  0.0260   0.0099  0.0131
                10   0.0044  0.0089   0.0034  0.0071   0.0023  0.0050   0.0015  0.0033   0.0004  0.0009

  7   a ≤ a_1    0   0.0012  0.0022   0.0009  0.0013   0.0006  0.0010   0.0004  0.0007      -       -
                 1   0.0156  0.0189   0.0134  0.0140   0.0109  0.0118   0.0086  0.0097   0.0044  0.0059
                 2   0.0884  0.0956   0.0827  0.0832   0.0756  0.0770   0.0681  0.0704   0.0521  0.0564
      a ≥ a_1    6   0.1492  0.1587   0.1426  0.1460   0.1341  0.1378   0.1250  0.1300   0.1056  0.1127
                 7   0.0241  0.0385   0.0216  0.0306   0.0186  0.0269   0.0156  0.0232   0.0102  0.0160

  5   a ≤ a_1    0   0.0088  0.0099   0.0078  0.0090   0.0066  0.0080   0.0056  0.0070   0.0036  0.0051
                 1   0.0816  0.0811   0.0778  0.0781   0.0730  0.0742   0.0681  0.0701   0.0578  0.0616
      a ≥ a_1    5   0.0725  0.0811   0.0690  0.0781   0.0646  0.0742   0.0601  0.0701   0.0511  0.0616


Table 9 (continued)

(B) m = ⅘N, n = ⅕N

                       m = 80, n = 20   m = 48, n = 12   m = 32, n = 8
  r   Chance    a_1   True    Normal   True    Normal   True    Normal
      that

 30   a ≤ a_1   18   0.0018  0.0014      -       -        -       -
                19   0.0084  0.0073   0.0013  0.0020      -       -
                20   0.0300  0.0288   0.0106  0.0125      -       -
                21   0.0884  0.0874   0.0521  0.0548      -       -
                22   0.2040  0.2078   0.1667  0.1685      -       -
      a ≥ a_1   26   0.2092  0.2078   0.1667  0.1685      -       -
                27   0.0824  0.0874   0.0521  0.0548      -       -
                28   0.0227  0.0288   0.0106  0.0125      -       -
                29   0.0039  0.0073   0.0013  0.0020      -       -
                30   0.0003  0.0014      -       -        -       -

 20   a ≤ a_1   11   0.0040  0.0026   0.0013  0.0011      -       -
                12   0.0182  0.0148   0.0095  0.0087   0.0016  0.0031
                13   0.0638  0.0600   0.0460  0.0448   0.0218  0.0255
                14   0.1729  0.1755   0.1523  0.1542   0.1176  0.1208
      a ≥ a_1   18   0.1758  0.1755   0.1522  0.1542   0.1176  0.1208
                19   0.0499  0.0600   0.0371  0.0448   0.0218  0.0255
                20   0.0066  0.0148   0.0041  0.0087   0.0016  0.0031

 15   a ≤ a_1    7   0.0018  0.0009   0.0008  0.0004      -       -
                 8   0.0107  0.0074   0.0064  0.0049   0.0022  0.0024
                 9   0.0402  0.0408   0.0355  0.0323   0.0217  0.0219
                10   0.1470  0.1480   0.1329  0.1338   0.1115  0.1133
      a ≥ a_1   14   0.1453  0.1480   0.1294  0.1338   0.1079  0.1133
                15   0.0262  0.0408   0.0206  0.0323   0.0141  0.0219

 10   a ≤ a_1    4   0.0039  0.0019   0.0026  0.0013   0.0012  0.0008
                 5   0.0254  0.0191   0.0206  0.0159   0.0145  0.0121
                 6   0.1095  0.1068   0.1012  0.0988   0.0893  0.0882
      a ≥ a_1   10   0.0951  0.1068   0.0868  0.0988   0.0761  0.0882

  7   a ≤ a_1    2   0.0033  0.0013   0.0024  0.0010   0.0015  0.0007
                 3   0.0282  0.0203   0.0246  0.0181   0.0201  0.0155
                 4   0.1408  0.1417   0.1354  0.1364   0.1281  0.1293
      a ≥ a_1    7   0.1985  0.1910   0.1906  0.1848   0.1805  0.1776

  5   a ≤ a_1    1   0.0053  0.0022   0.0045  0.0021   0.0035  0.0016
                 2   0.0531  0.0434   0.0499  0.0430   0.0457  0.0383
      a ≥ a_1    5   0.3193  0.2841   0.3135  0.2835   0.3060  0.2778


(C) m = (9/10)N, n = (1/10)N

                       m = 90, n = 10
  r   Chance    a_1   True    Normal
      that

 30   a ≤ a_1   22   0.0009  0.0006
                23   0.0073  0.0057
                24   0.0388  0.0352
                25   0.1384  0.1388
      a ≥ a_1   29   0.1356  0.1388
                30   0.0229  0.0352

 20   a ≤ a_1   14   0.0039  0.0019
                15   0.0254  0.0191
                16   0.1095  0.1068
      a ≥ a_1   20   0.0951  0.1068

 15   a ≤ a_1    9   0.0006  0.0001
                10   0.0063  0.0027
                11   0.0408  0.0316
                12   0.1705  0.1765
      a ≥ a_1   15   0.1808  0.1765

 10   a ≤ a_1    5   0.0006  0.0001
                 6   0.0082  0.0029
                 7   0.0600  0.0486
                 8   0.2615  0.2902
      a ≥ a_1   10   0.3305  0.2902

  7   a ≤ a_1    3   0.0016  0.0003
                 4   0.0207  0.0096
                 5   0.1442  0.1492
      a ≥ a_1    7   0.4667  0.3974

  5   a ≤ a_1    2   0.0067  0.0006
                 3   0.0769  0.0538
      a ≥ a_1    5   0.4163  0.5000




2x2 TABLES. A NOTE ON E. S. PEARSON’S PAPER 

By G. A. BARNARD 

As Prof. Pearson has kindly shown me the proof of his paper, I should like to make the 
following further remarks. 

1. If we have a sample of N from a population in which there is a chance p that an
individual will have a character A, we can represent it in the form

$$x_1, x_2, \ldots, x_i, \ldots, x_N,$$

where $x_i$ is 1 or 0 according as to whether the ith member has A or not.* Regarding the
x’s as quantitative variables, we have by classical results the unbiased estimates

$$\hat{\mu} = x_{\cdot} = (\Sigma x_i)/N \quad \text{and} \quad \hat{\sigma}^2 = (\Sigma (x_i - x_{\cdot})^2)/(N - 1).$$

If r of the x’s are 1, while s are 0, we find

$$\hat{\mu} = r/N \quad \text{and} \quad \hat{\sigma}^2 = rs/N(N - 1).$$

Using this unbiased estimate of variance in Prof. Pearson’s para. 43, we get, in place of the
expression he obtains there,

$$\frac{a - rm/N}{\sqrt{\dfrac{mnrs}{N^2(N - 1)}}}, \qquad (1)$$

agreeing exactly with his (22).
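The reduction of the classical variance estimate to rs/N(N − 1) for 0/1 data can be verified in exact arithmetic. The sketch below is illustrative; the function name is hypothetical and the margins r = 20, s = 10 are borrowed from Prof. Pearson's Table 7.

```python
from fractions import Fraction


def unbiased_variance(xs):
    """Classical unbiased estimate: sum of squared deviations over (N - 1)."""
    N = len(xs)
    mean = Fraction(sum(xs), N)
    return sum((Fraction(x) - mean) ** 2 for x in xs) / (N - 1)


r, s = 20, 10            # number of 1's and of 0's
xs = [1] * r + [0] * s   # the sample written as quantitative variables
N = r + s

estimate = unbiased_variance(xs)
closed_form = Fraction(r * s, N * (N - 1))
print(estimate, closed_form)
```

The sum of squares is rs/N, so that dividing by N − 1 gives the stated form with no approximation.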

2. To carry the argument further, in classical theory, if we have two samples

$$(x_1, x_2, \ldots, x_i, \ldots, x_m) \quad \text{and} \quad (y_1, y_2, \ldots, y_j, \ldots, y_n),$$

to test whether the samples come from the same normal population we take

$$t = \frac{x_{\cdot} - y_{\cdot}}{\hat{\sigma}} \sqrt{\frac{mn}{m + n}},$$

where $x_{\cdot} = (\Sigma x_i)/m$, $y_{\cdot} = (\Sigma y_j)/n$, and

$$\hat{\sigma}^2 = \frac{\Sigma (x_i - x_{\cdot})^2 + \Sigma (y_j - y_{\cdot})^2}{m + n - 2}, \qquad (2)$$

and use tables of the t distribution for (m + n − 2) degrees of freedom.

It is common practice to neglect departures from normality in applying this test. If we
do so, and apply it to our qualitative case along the lines indicated above, we get

$$t = \frac{a - rm/N}{\sqrt{\dfrac{acn + bdm}{N(N - 2)}}},$$

which, if we are justified in our neglect of departures from normality, should be distributed
as t on (N − 2) degrees of freedom.
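That the closed form above is simply the classical two-sample t evaluated on 0/1 data can be checked numerically. In this sketch the function names are hypothetical, and the frequencies are those of Prof. Pearson's Table 7, with the first sample containing a ones and c zeros and the second b ones and d zeros.

```python
from math import isclose, sqrt


def t_from_raw(xs, ys):
    """Classical two-sample t with the pooled variance estimate (2)."""
    m, n = len(xs), len(ys)
    x_bar, y_bar = sum(xs) / m, sum(ys) / n
    ss = sum((x - x_bar) ** 2 for x in xs) + sum((y - y_bar) ** 2 for y in ys)
    sigma = sqrt(ss / (m + n - 2))
    return (x_bar - y_bar) / sigma * sqrt(m * n / (m + n))


def t_closed_form(a, b, c, d):
    """The same quantity written in terms of the 2 x 2 frequencies."""
    m, n, r = a + c, b + d, a + b
    N = m + n
    return (a - r * m / N) / sqrt((a * c * n + b * d * m) / (N * (N - 2)))


a, b, c, d = 15, 5, 3, 7
xs = [1] * a + [0] * c   # first sample, m = 18
ys = [1] * b + [0] * d   # second sample, n = 12

t_raw = t_from_raw(xs, ys)
t_table = t_closed_form(a, b, c, d)
print(round(t_raw, 4), round(t_table, 4))
```

The agreement is an algebraic identity; whether the ratio is in fact distributed as t on N − 2 degrees of freedom is the separate question raised in the text.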


* For a similar argument see B. L. Welch (1938, p. 155).




3. To obtain the formula (1) on these lines, we have in effect to commit the well-known
fallacy of replacing $\sigma^2$ as given by (2), by

$$\sigma'^2 = \frac{\sum (x_i - \mu')^2 + \sum (y_j - \mu')^2}{m+n-1}, \tag{3}$$

where $\mu' = (\sum x_i + \sum y_j)/(m+n)$.

We are led to ask why (3) should be approximately correct (and in fact it is better than (2))
in the qualitative case, while (2) is preferred in the quantitative case.

4. The simplest reason for preferring (2) to (3) in the quantitative case is that $\sigma'^2$ is not
independent of $(x_\cdot - y_\cdot)$, so that the conditions for validity of the t distribution are not
satisfied. In our qualitative case this argument loses validity, since neither $\sigma^2$ nor $\sigma'^2$ is
independent of $(x_\cdot - y_\cdot)$.

The second reason for preferring (2) to (3) in the quantitative case is more complicated,
but for our purposes it reduces essentially to the fact that, in the case of normal distributions,
and only in this case, the mean and variance of samples are independently distributed, so
that the common mean value of the populations, estimated by $\mu'$, is irrelevant to the test
for differences. In our qualitative case, on the other hand, $\mu'$ contributes to our knowledge
of the variance.

5. If we apply Pitman's 'absolute' analogue of the t test to our case, we arrive at the
hypergeometric series of Prof. Pearson's Problem I. But Bartlett's argument, showing the
convergence of Pitman's test and the t test, will apply here only in very large samples, because
of the finite probability of obtaining observed values which coincide.

6. From the above point of view, Prof. Pearson’s analysis of his Problem II may be 
regarded in one sense as an examination of the effect of large departures from normality on 
the t test. In this light, his conclusions given in paras. 51 and 52 are seen to extend to the 
t test, as well as to the 2x2 table problem. 

7. If I may state my personal attitude, it is that statistics is a branch of applied
mathematics, like symbolic logic or hydrodynamics. Examination of foundations is 
desirable, but it must be remembered that undue emphasis on niceties is a disease to which 
persons with mathematical training are specially prone. In pure mathematics itself there are 
disputes on foundations which closely parallel the disputes over the foundations of statistics. 
The lesson to be drawn is, that while statistics is a most valuable aid to judgement, it cannot 
wholly replace it. 

8. Finally, it must be emphasized that the order of printing of Prof. Pearson’s paper and 
my own reflects Prof. Pearson’s generosity rather than the historical order of events. Much 
of his paper was, unknown to me, given in lectures before the war; whereas my work on the 
problem began only in 1943. Since then I have owed much both to Prof. Pearson’s published 
work and to discussions which I have been privileged to have with him. 


REFERENCE 

Welch, B. L. (1938). Biometrika, 30, 156. 





THE CUMULANTS OF THE Z AND OF THE LOGARITHMIC $\chi^2$

AND t DISTRIBUTIONS

By JOHN WISHART 
School of Agriculture, Cambridge 

Explicit expressions for the exact cumulants of Fisher's z-distribution do not appear ever
to have been published. They were therefore worked out, and appear in § 2 of this paper.
It afterwards appeared that the logical method of presentation was to deal with the similar
problem for $\tfrac12\log(\chi^2/n)$,* since the z-distribution involves the simple difference of two such
functions which are independent. This led to § 1. Since writing this paper, Bartlett & Kendall
(1946) have published the same result in the form of the cumulants of $\log s^2$, and have given
graphical and tabular representations for varying n up to 20. The solution is, of course,
implicit in Cornish & Fisher's (1937) statement of the moment generating function, while
Mr C. R. Rao has informed me that he reached the same result in work done for an M.A.
Thesis of the University of Calcutta (unpublished). § 1 has accordingly been shortened,
but is retained in view of the additional formulae to those of Bartlett and Kendall.


1. The logarithmic $\chi^2$ distribution

The distribution of $\chi^2$, for n degrees of freedom, is given by

$$\frac{1}{\Gamma(\tfrac12 n)}\,(\tfrac12\chi^2)^{\frac12 n-1}\,e^{-\frac12\chi^2}\,d(\tfrac12\chi^2).$$


As pointed out by Cornish & Fisher (1937), the mean value of $\exp\{\tfrac12 it\log(\chi^2/n)\}$,

i.e. of $\exp\{\tfrac12 it\log(\tfrac12\chi^2) - \tfrac12 it\log(\tfrac12 n)\}$,

is the moment generating function of the distribution of $\tfrac12\log(\chi^2/n)$, namely

$$M = \frac{\Gamma\tfrac12(n+it)}{\Gamma(\tfrac12 n)}\exp\{-\tfrac12 it\log(\tfrac12 n)\}.$$

The cumulant generating function is

$$K = \log M = -\tfrac12 it\log(\tfrac12 n) + \log\Gamma\tfrac12(n+it) - \log\Gamma(\tfrac12 n).$$

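The cumulants of the distribution of $\tfrac12\log(\chi^2/n)$ are readily written down by differentiating
K successively with respect to it and at each stage putting t = 0. We have in fact

$$\kappa_1 = -\tfrac12\log(\tfrac12 n) + \frac{d}{dn}\log\Gamma(\tfrac12 n)
= -\tfrac12\Big\{\log a + \mathrm{Lt}_{s\to1}\Big(\zeta(s,a) - \frac{1}{s-1}\Big)\Big\}$$

and

$$\kappa_s = (-1)^s\,2^{-s}(s-1)!\,\zeta(s,a) \quad (s>1,\ a=\tfrac12 n), \tag{1}$$

where $\zeta(s,a)$ denotes the generalized Zeta-function

$$\zeta(s,a) = \sum_{j=0}^{\infty}\frac{1}{(a+j)^s}.$$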


* All logarithms in this paper are to base e.




The cumulants may be readily computed by throwing them into the form

$$2\kappa_1 = \psi(\tfrac12 n) - \log(\tfrac12 n), \qquad 2^s\kappa_s = \psi^{(s-1)}(\tfrac12 n), \tag{2}$$

where $\psi(x) = d\{\log\Gamma(x)\}/dx$, $\psi^{(s-1)}(x) = d^s\{\log\Gamma(x)\}/dx^s$.

$\psi(x)$ is variously called the Psi or Digamma function, and its derivatives have been called
the Trigamma, Tetragamma, etc. Functions, and the series the Polygamma Functions.
These functions have been computed in some considerable detail. For n up to 22 the mean
and variance can be got from Elinor Pairman's 'Tables of the Digamma and Trigamma
functions' (1919). Tables up to Pentagamma appear in Vol. I of the British Association's
Mathematical Tables (1931), but with certain gaps which, although intended to be bridged
by reduction formulae, render the tables less generally useful (for n less than 22) than
H. T. Davis's Tables (1933, 1935). Table 10 of Vol. I gives all that is required for $\psi(x)$; in
Vol. II, Tables 14-16, 18-20, 22-24 and 26-28 cover a wide range up to Hexagamma.

As shown by Bartlett & Kendall (1946), the approach to normality is very slow. For
n = 24 (the limit for $n_1$ of the z table of Fisher & Yates (1943), which provides percentage
points for the distribution under consideration in the line $n_2 = \infty$) the cumulants have been
worked out to $\kappa_6$, the last being specially computed from its formula given below. The gamma
ratios are $\gamma_1 = -0.295$, $\gamma_2 = 0.174$, $\gamma_3 = -0.154$ and $\gamma_4 = 0.175$, and $|\gamma|$ increases thereafter
at this level of n instead of tending to zero. Approximate percentage points may,
however, be worked out by using the formulae at the foot of the z table, putting $n_2 = \infty$.
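The first two gamma ratios quoted for n = 24 can be reproduced from formula (1). The sketch below is one possible way of doing so: it evaluates the generalized Zeta-function by direct summation of the first terms plus an Euler-Maclaurin tail, a numerical device chosen here for illustration and not the tabular method used in the paper. The function names are hypothetical.

```python
import math

# Bernoulli numbers B_2, B_4, ..., B_12 in the modern signed convention.
BERNOULLI_EVEN = [1/6, -1/30, 1/42, -1/30, 5/66, -691/2730]

def hurwitz_zeta(s, a, shift=20):
    """zeta(s, a) = sum_{j>=0} (a + j)^(-s) for integer s >= 2 and a > 0.

    The first `shift` terms are summed directly; the remainder is
    estimated with the Euler-Maclaurin formula.
    """
    head = sum((a + j) ** -s for j in range(shift))
    x = a + shift
    tail = x ** (1 - s) / (s - 1) + 0.5 * x ** -s
    for k, b in enumerate(BERNOULLI_EVEN, start=1):
        rising = math.prod(s + i for i in range(2 * k - 1))  # s(s+1)...(s+2k-2)
        tail += b / math.factorial(2 * k) * rising / x ** (s + 2 * k - 1)
    return head + tail

def log_chi2_cumulant(s, n):
    """kappa_s of (1/2) log(chi^2 / n) for s > 1, following formula (1)."""
    return (-1) ** s * 2.0 ** -s * math.factorial(s - 1) * hurwitz_zeta(s, n / 2)

n = 24
k2, k3, k4 = (log_chi2_cumulant(s, n) for s in (2, 3, 4))
gamma1 = k3 / k2 ** 1.5
gamma2 = k4 / k2 ** 2
print(gamma1, gamma2)
```

The values obtained agree with the quoted ratios to within about one unit in the third decimal place.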

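For small n, we note that

$$\zeta(s) = \frac{1}{1^s}+\frac{1}{2^s}+\frac{1}{3^s}+\ldots+\frac{1}{(r-1)^s}+\zeta(s,r) \quad (r \text{ an integer}),$$

$$\zeta(s)(1-2^{-s}) = \frac{1}{1^s}+\frac{1}{3^s}+\frac{1}{5^s}+\ldots \text{ to } \infty
= \frac{1}{1^s}+\frac{1}{3^s}+\frac{1}{5^s}+\ldots+\frac{1}{(2r-1)^s}+2^{-s}\zeta(s,r+\tfrac12).$$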


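We thus get, for n = 2r,

$$\kappa_s = \frac{(-1)^s(s-1)!}{2^s}\Big\{\zeta(s)-\Big(1+\frac{1}{2^s}+\frac{1}{3^s}+\ldots+\frac{1}{(\tfrac12 n-1)^s}\Big)\Big\} \quad (s>1), \tag{3}$$

in which the terms in {...} reduce to $\zeta(s)$ for n = 2.
For n = 2r + 1

$$\kappa_s = (-1)^s(s-1)!\Big\{\zeta(s)(1-2^{-s})-\Big(1+\frac{1}{3^s}+\frac{1}{5^s}+\ldots+\frac{1}{(n-2)^s}\Big)\Big\} \quad (s>1), \tag{4}$$

in which the terms in {...} reduce to $\zeta(s)(1-2^{-s})$ for n = 1.
In the special case of s = 1 we have

$$n = 2r: \quad \kappa_1 = -\tfrac12(\gamma+\log\tfrac12 n)+\tfrac12\Big(1+\frac12+\frac13+\ldots+\frac{1}{\tfrac12 n-1}\Big),$$

$$n = 2r+1: \quad \kappa_1 = -\tfrac12(\gamma+\log 2n)+\Big(1+\frac13+\frac15+\ldots+\frac{1}{n-2}\Big). \tag{5}$$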


For n = 2 and 1 respectively these expressions reduce to the first bracket. $\zeta(s)$ can be got
from tables, and in particular

$$\zeta(2m) = 2^{2m-1}\pi^{2m}B_m/(2m)!,$$

where the B's are the Bernoulli numbers $B_1 = \tfrac16$, $B_2 = \tfrac1{30}$, $B_3 = \tfrac1{42}$, $B_4 = \tfrac1{30}$, $B_5 = \tfrac5{66}$, etc.
$\gamma$ is Euler's constant. For reference we may quote:

$\gamma$ = 0.57721 56649, $\zeta(4)$ = 1.08232 32337,

$\zeta(2)$ = 1.64493 40668, $\zeta(5)$ = 1.03692 77551,

$\zeta(3)$ = 1.20205 69032, $\zeta(6)$ = 1.01734 30620.

Note that $\mathrm{Lt}_{s\to1}\Big\{\zeta(s,a)-\dfrac{1}{s-1}\Big\} = \gamma+\sum_{j=0}^{\infty}\Big\{\dfrac{1}{a+j}-\dfrac{1}{1+j}\Big\}$, $(R(a)>0)$.
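The Bernoulli-number formula for $\zeta(2m)$ can be checked against the quoted constants. The sketch below uses the older all-positive Bernoulli notation of the text ($B_1 = 1/6$, $B_2 = 1/30$, ...) and compares the closed form with a direct partial sum plus an integral tail correction; the helper names are hypothetical.

```python
import math
from fractions import Fraction

# Bernoulli numbers in the older all-positive notation used in the text.
B = {1: Fraction(1, 6), 2: Fraction(1, 30), 3: Fraction(1, 42),
     4: Fraction(1, 30), 5: Fraction(5, 66)}

def zeta_even(m):
    """zeta(2m) = 2^(2m-1) * pi^(2m) * B_m / (2m)!"""
    return 2 ** (2 * m - 1) * math.pi ** (2 * m) * float(B[m]) / math.factorial(2 * m)

def zeta_direct(s, terms=10000):
    """Partial sum of k^(-s) with a simple integral correction for the tail."""
    head = sum(k ** -s for k in range(1, terms + 1))
    return head + terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s

for m in (1, 2, 3):
    print(2 * m, zeta_even(m), zeta_direct(2 * m))
```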

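For large n, asymptotic formulae for the Zeta-function may be used, and we get

$$\kappa_1 \sim -\frac{1}{2n}-\frac{1}{6n^2}+\frac{1}{15n^4}-\frac{8}{63n^6}+\frac{8}{15n^8}-\frac{128}{33n^{10}}+\ldots,$$

$$\kappa_s \sim (-1)^s\Big\{\frac{(s-2)!}{2n^{s-1}}+\frac{(s-1)!}{2n^s}+2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\,n^{2j+s-1}}\Big\} \quad (s>1).$$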
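For an even number of degrees of freedom the asymptotic series for $\kappa_1$ can be compared with the exact closed form for n = 2r (harmonic sum, Euler's constant). The sketch below does this for n = 24, keeping the series terms up to $n^{-8}$; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649  # value quoted in the text

def kappa1_exact_even(n):
    """Exact kappa_1 of (1/2) log(chi^2/n) for even n = 2r."""
    half = n // 2
    harmonic = sum(1.0 / k for k in range(1, half))  # 1 + 1/2 + ... + 1/(n/2 - 1)
    return -0.5 * (EULER_GAMMA + math.log(n / 2)) + 0.5 * harmonic

def kappa1_asymptotic(n):
    """Leading terms of the large-n series, truncated after the n^-8 term."""
    return (-1 / (2 * n) - 1 / (6 * n ** 2) + 1 / (15 * n ** 4)
            - 8 / (63 * n ** 6) + 8 / (15 * n ** 8))

n = 24
exact = kappa1_exact_even(n)
approx = kappa1_asymptotic(n)
print(exact, approx)
```

At n = 24 the two values agree to well beyond seven decimal places, which illustrates how quickly the series settles for moderate n.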


We may note in passing that not only may this general expression for $\kappa_s$ be applied to the
special case s = 1 with the proviso that the first term in that case is dropped, but also that
$\kappa_2$ may be obtained from $\kappa_1+\tfrac12\log(\tfrac12 n)$ by term-by-term differentiation with respect to n,
and likewise $\kappa_3$ from $\kappa_2$, $\kappa_4$ from $\kappa_3$, etc., by similar term-by-term differentiation. This follows
from a property of the Zeta-function. It is therefore not necessary to write down the explicit
expressions for $\kappa_2$, $\kappa_3$, etc., but we may note that their leading terms are $\tfrac12 n^{-1}$, $-\tfrac12 n^{-2}$, $n^{-3}$,
$-3n^{-4}$, $12n^{-5}$, etc., so that the leading terms of $\gamma_1$ and $\gamma_2$ are $-\sqrt{(2/n)}$ and $4/n$ respectively,
while $\gamma_r$ is $O(n^{-\frac12 r})$. More exactly we have, writing $n' = n-1$,

$$\kappa_2 \sim \frac{1}{2n'}\Big(1-\frac{1}{3n'^2}+\frac{7}{15n'^4}-\ldots\Big), \tag{6}$$

with corresponding expressions for $\kappa_3$, $\kappa_4$, etc., obtained by differentiation with respect to $n'$.

Finally, if instead of the distribution of $\tfrac12\log(\chi^2/n)$ we are interested in the distribution of
$\log(s^2)$, where $s^2$ is an estimate of $\sigma^2$ based on n degrees of freedom, we have

$$\log(\chi^2/n) = \log s^2 - \log\sigma^2$$

and thus for the distribution of $\log(s^2)$ we have

$$\kappa_1 = \log\Big(\frac{2\sigma^2}{n}\Big)+2\frac{d}{dn}\log\Gamma(\tfrac12 n)$$

and

$$\kappa_s = (-1)^s(s-1)!\,\zeta(s,a) \quad (s>1,\ a=\tfrac12 n),$$

while the $\gamma$ ratios are the same as for $\tfrac12\log(\chi^2/n)$. Obviously $\log\chi$ and $\log s$ can be treated
similarly. See Bartlett & Kendall (1946).





2. The z distribution

The distribution of $z = \tfrac12\log(s_1^2/s_2^2)$, where $s_1^2$ and $s_2^2$ are independent estimates of a variance
$\sigma^2$, based respectively on $\nu_1$ and $\nu_2$ degrees of freedom, is obviously that of

$$\tfrac12\log(\chi_1^2/\nu_1)-\tfrac12\log(\chi_2^2/\nu_2)$$

and its cumulants may therefore be at once derived from those of the logarithmic $\chi^2$ distribution.
The cumulant generating function is

$$K = \log M = \tfrac12 it\log(\nu_2/\nu_1)+\log\Gamma\tfrac12(\nu_1+it)+\log\Gamma\tfrac12(\nu_2-it)-\log\Gamma(\tfrac12\nu_1)-\log\Gamma(\tfrac12\nu_2).$$

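Further, we have

$$\kappa_1 = \tfrac12\log\frac{\nu_2}{\nu_1}+\frac{d}{d\nu_1}\log\Gamma(\tfrac12\nu_1)-\frac{d}{d\nu_2}\log\Gamma(\tfrac12\nu_2)
= \tfrac12\Big\{\log\frac{\nu_2}{\nu_1}+\mathrm{Lt}_{s\to1}\big(\zeta(s,a_2)-\zeta(s,a_1)\big)\Big\}, \tag{7}$$

$$\kappa_s = 2^{-s}(s-1)!\,\{\zeta(s,a_2)+(-1)^s\zeta(s,a_1)\} \quad (s>1,\ a_1=\tfrac12\nu_1,\ a_2=\tfrac12\nu_2).$$

For computing purposes these may be thrown into the forms

$$2\kappa_1 = \log(\nu_2/\nu_1)+\psi(\tfrac12\nu_1)-\psi(\tfrac12\nu_2),$$

$$2^s\kappa_s = \psi^{(s-1)}(\tfrac12\nu_1)+(-1)^s\psi^{(s-1)}(\tfrac12\nu_2) \quad (s>1). \tag{8}$$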

To illustrate, let us take $\nu_1 = 24$, $\nu_2 = 60$. We then have from the Polygamma tables (except
for $\kappa_6$, which was specially computed):

$\kappa_1 = -0.0127429$, $\kappa_3 = -0.0007998$, $\kappa_5 = -0.0000104$,

$\kappa_2 = 0.0301992$, $\kappa_4 = 0.0000867$, $\kappa_6 = 0.0000019$,

$\sigma = \sqrt{\kappa_2} = 0.1737792$,

$\gamma_1 = -0.152$, $\gamma_2 = 0.095$ (or $\beta_1 = 0.023$, $\beta_2 = 3.095$), indicating the degree and nature of
the departure from normality. $\gamma_3$ and $\gamma_4$ are $-0.066$ and $0.067$ respectively.
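The cumulants of this example can be recomputed from (7) and (8) without tables. The sketch below is only one possible numerical route: it evaluates the Digamma function and the generalized Zeta-function by direct summation plus an asymptotic tail, which is an assumption of this illustration and not the paper's tabular method. All function names are hypothetical.

```python
import math

# Bernoulli numbers B_2, B_4, ..., B_12 in the modern signed convention.
BERNOULLI_EVEN = [1/6, -1/30, 1/42, -1/30, 5/66, -691/2730]

def digamma(a, shift=20):
    """psi(a) via the recurrence psi(a) = psi(a + shift) - sum 1/(a + j)."""
    head = sum(1.0 / (a + j) for j in range(shift))
    x = a + shift
    tail = math.log(x) - 0.5 / x
    for k, b in enumerate(BERNOULLI_EVEN, start=1):
        tail -= b / (2 * k * x ** (2 * k))
    return tail - head

def hurwitz_zeta(s, a, shift=20):
    """zeta(s, a) for integer s >= 2: direct sum plus Euler-Maclaurin tail."""
    head = sum((a + j) ** -s for j in range(shift))
    x = a + shift
    tail = x ** (1 - s) / (s - 1) + 0.5 * x ** -s
    for k, b in enumerate(BERNOULLI_EVEN, start=1):
        rising = math.prod(s + i for i in range(2 * k - 1))  # s(s+1)...(s+2k-2)
        tail += b / math.factorial(2 * k) * rising / x ** (s + 2 * k - 1)
    return head + tail

def z_cumulant(s, nu1, nu2):
    """kappa_s of z = (1/2) log(s1^2 / s2^2) on nu1 and nu2 degrees of freedom."""
    if s == 1:
        return 0.5 * (math.log(nu2 / nu1) + digamma(nu1 / 2) - digamma(nu2 / 2))
    return (2.0 ** -s * math.factorial(s - 1)
            * (hurwitz_zeta(s, nu2 / 2) + (-1) ** s * hurwitz_zeta(s, nu1 / 2)))

kappa = {s: z_cumulant(s, 24, 60) for s in range(1, 7)}
sigma = math.sqrt(kappa[2])
gamma1 = kappa[3] / kappa[2] ** 1.5
gamma2 = kappa[4] / kappa[2] ** 2
for s in range(1, 7):
    print(s, round(kappa[s], 7))
```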

If as a first approximation we assume that for $\nu_1$ and $\nu_2$ of the order of the numbers chosen
in this example, or higher, z is distributed normally with mean and variance given by the
above formulae, we obtain approximate percentage points, e.g. for the 95 and 5% points
we can subtract and add $1.6449\sigma$ from and to the mean. The result in the present case is to
give us $-0.299$ and $0.273$, the correct values being $-0.306$ and $0.265$. The approximation is
adequate to almost two figure accuracy, and is evidently useful when we only require to
know whether an observed z is significant or not. A better approximation is provided by
the formulae attached to the z tables (see Fisher & Yates (1943)), which yield $-0.3045$ and
$0.2653$ as against the correct values of $-0.3055$ (see Thompson (1941)) and $0.2654$.
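The normal first approximation amounts to the following short calculation, using the mean and standard deviation quoted for the example with 24 and 60 degrees of freedom. The variable names are illustrative only.

```python
# Mean and standard deviation of z quoted for the (24, 60) example.
kappa1 = -0.0127429
sigma = 0.1737792
normal_deviate = 1.6449  # one-sided 5% point of the unit normal distribution

lower = kappa1 - normal_deviate * sigma
upper = kappa1 + normal_deviate * sigma
print(round(lower, 3), round(upper, 3))
```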

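Explicit algebraic expressions are readily written down for the cumulants for small $\nu_1$
and $\nu_2$, using the same method as for the logarithmic $\chi^2$ distribution. Where it is necessary
to do so, $\nu_1$ will be assumed less than $\nu_2$. In the contrary case we need only interchange $\nu_1$
and $\nu_2$, changing the sign of the odd cumulants in so doing. The odd cumulants are zero when
$\nu_1 = \nu_2$. We have

Even cumulants (r = 2s, s > 0)

$\nu_1 = 2p$, $\nu_2 = 2q$

$$\kappa_r = 2(r-1)!\Big[\frac{\zeta(r)}{2^r}-\Big(\frac{1}{2^r}+\frac{1}{4^r}+\ldots+\frac{1}{(\nu_1-2)^r}\Big)-\frac12\Big(\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big)\Big], \tag{9}$$

$\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_r = 2(r-1)!\Big[\zeta(r)(1-2^{-r})-\Big(1+\frac{1}{3^r}+\ldots+\frac{1}{(\nu_1-2)^r}\Big)-\frac12\Big(\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big)\Big], \tag{10}$$

$\nu_1 = \nu_2$. Drop out the last bracket of terms in the above two cases.

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_r = (r-1)!\Big[\zeta(r)-\Big(\frac{1}{2^r}+\frac{1}{4^r}+\ldots+\frac{1}{(\nu_1-2)^r}\Big)-\Big(1+\frac{1}{3^r}+\frac{1}{5^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big)\Big], \tag{11}$$

$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case.

Odd cumulants (r = 2s + 1, s > 0)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_r = -(r-1)!\Big[\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big], \tag{12}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_r = (r-1)!\Big[\zeta(r)(1-2^{1-r})+\Big(\frac{1}{2^r}+\frac{1}{4^r}+\ldots+\frac{1}{(\nu_1-2)^r}\Big)-\Big(1+\frac{1}{3^r}+\frac{1}{5^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big)\Big], \tag{13}$$

$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case, and change the sign of $\kappa_r$.

In the special case of s = 1, we have

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big)-\Big(\frac{1}{\nu_1}+\frac{1}{\nu_1+2}+\ldots+\frac{1}{\nu_2-2}\Big),$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big)+\log 2+\Big(\frac12+\frac14+\ldots+\frac{1}{\nu_1-2}\Big)-\Big(1+\frac13+\frac15+\ldots+\frac{1}{\nu_2-2}\Big). \tag{14}$$

$\nu_1 = 2p+1$, $\nu_2 = 2q$. Interchange $\nu_1$ and $\nu_2$ in this last case and change the sign of $\kappa_1$.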
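For large $\nu_1$ and $\nu_2$, a combination of the asymptotic formulae already given readily yields
the following results:

$$\kappa_1 \sim \frac12\Big(\frac{1}{\nu_2}-\frac{1}{\nu_1}\Big)+\frac16\Big(\frac{1}{\nu_2^2}-\frac{1}{\nu_1^2}\Big)-\frac{1}{15}\Big(\frac{1}{\nu_2^4}-\frac{1}{\nu_1^4}\Big)+\ldots, \tag{15}$$

the numerical coefficients being as for the $\kappa_1$ of $\tfrac12\log(\chi^2/n)$,

$$\kappa_s \sim \frac{(s-2)!}{2}\Big(\frac{1}{\nu_2^{s-1}}+\frac{(-1)^s}{\nu_1^{s-1}}\Big)+\frac{(s-1)!}{2}\Big(\frac{1}{\nu_2^{s}}+\frac{(-1)^s}{\nu_1^{s}}\Big)
+2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!}\Big(\frac{1}{\nu_2^{2j+s-1}}+\frac{(-1)^s}{\nu_1^{2j+s-1}}\Big). \tag{16}$$

We may put s = 1 in $\kappa_s$ provided we drop the first term. We note also that $\kappa_2$ and higher
cumulants can be written down immediately by differentiating the terms in $\nu_2$ and $\nu_1$ of
$\kappa_1-\tfrac12\log(\nu_2/\nu_1)$ successively with respect to $-\nu_2$ and $\nu_1$ respectively.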

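These are the results given by Cornish & Fisher (1937), whose formulae can be extended
at sight by means of the results of this paper. A first approximation not only gives the
familiar results

$$\kappa_1 = \frac12\Big(\frac{1}{\nu_2}-\frac{1}{\nu_1}\Big), \qquad \kappa_2 = \frac12\Big(\frac{1}{\nu_1}+\frac{1}{\nu_2}\Big),$$

but also the more general

$$\kappa_s = \frac{(s-2)!}{2}\Big(\frac{1}{\nu_2^{s-1}}+\frac{(-1)^s}{\nu_1^{s-1}}\Big),$$

but it should be noted that for all s > 1 a second approximation, which takes in an additional
term, is

$$\kappa_s = \frac{(s-2)!}{2}\Big(\frac{1}{(\nu_2-1)^{s-1}}+\frac{(-1)^s}{(\nu_1-1)^{s-1}}\Big).$$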


The accuracy of the asymptotic approximation at the limits of the z table given by Fisher
& Yates (1943) can be seen by applying it to our example ($\nu_1 = 24$, $\nu_2 = 60$). The numbers
of terms which are significant in the eighth place (needed for final accuracy to 7 decimal
places), are three for $\kappa_1$, four for $\kappa_2$ and $\kappa_3$, and three for $\kappa_4$, $\kappa_5$ and $\kappa_6$. The first term for $\kappa_6$,
namely $12(\nu_2^{-5}+\nu_1^{-5})$, yields 0.0000015, rather more than 20% too low. To use

$$12\{(\nu_2-1)^{-5}+(\nu_1-1)^{-5}\}$$

would give 0.0000019, about 2% too high.

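Should $\nu_1$ or $\nu_2$ be only moderate in size, the other being large, we may make use of the
relation

$$\zeta(s,a) = \frac{1}{a^s}+\frac{1}{(a+1)^s}+\frac{1}{(a+2)^s}+\ldots+\frac{1}{(a+r-1)^s}+\zeta(s,a+r) \quad (r \text{ an integer}),$$

where a is one-half of the smaller of $\nu_1$ or $\nu_2$, to convert our formulae into forms in which
asymptotic expansions may be applied to both of the Zeta-functions. We then have ($\nu_1 < \nu_2$):

$$\kappa_1 = \tfrac12\Big[\log\frac{\nu_2}{\nu_1}+\mathrm{Lt}_{s\to1}\{\zeta(s,\tfrac12\nu_2)-\zeta(s,\tfrac12\nu_1+r)\}\Big]-\Big(\frac{1}{\nu_1}+\frac{1}{\nu_1+2}+\ldots+\frac{1}{\nu_1+2r-2}\Big),$$

$$\kappa_s = (s-1)!\Big[2^{-s}\{\zeta(s,\tfrac12\nu_2)+(-1)^s\zeta(s,\tfrac12\nu_1+r)\}+(-1)^s\Big\{\frac{1}{\nu_1^s}+\frac{1}{(\nu_1+2)^s}+\ldots+\frac{1}{(\nu_1+2r-2)^s}\Big\}\Big], \tag{17}$$

and

$$\zeta(s,n) \sim \frac{1}{(s-1)n^{s-1}}+\frac{1}{2n^s}+\frac{1}{(s-1)!}\sum_{j=1}^{\infty}\frac{(-1)^{j-1}B_j\,(2j+s-2)!}{(2j)!\,n^{2j+s-1}}.$$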

Particular cases of some interest arise (i) when $r = \tfrac12(\nu_2-\nu_1)$, $\nu_1$ and $\nu_2$ being either both odd
or both even, and (ii) when $r = \tfrac12(\nu_2-\nu_1+1)$, $\nu_1$ (or $\nu_2$) being even and $\nu_2$ (or $\nu_1$) odd. In the
former case the first term within square brackets in $\kappa_s$ is $2^{-s}(1+(-1)^s)\zeta(s,\tfrac12\nu_2)$, which is
zero when s is odd and $2^{1-s}\zeta(s,\tfrac12\nu_2)$ when s is even. In the latter we have

$$2^{-s}\{\zeta(s,\tfrac12\nu_2)+(-1)^s\zeta(s,\tfrac12(\nu_2+1))\}$$

which is $\zeta(s,\nu_2)$ when s is even. With s odd we are concerned with the difference of two Zeta-functions
in which the a's differ by one-half, and the expression may be written

$$\sum_{j=0}^{\infty}\frac{(-1)^j}{(\nu_2+j)^s} = \frac{(-1)^{s-1}}{(s-1)!}\Big(\frac{\partial}{\partial\nu_2}\Big)^{s-1}\sum_{j=0}^{\infty}\frac{(-1)^j}{\nu_2+j},$$

where

$$\sum_{j=0}^{\infty}\frac{(-1)^j}{\nu_2+j} = \int_0^1\frac{x^{\nu_2-1}\,dx}{1+x} = \sum_{j=0}^{\infty}\frac{j!\,(\nu_2-1)!}{2^{j+1}(\nu_2+j)!} \quad \text{on integration}$$

$$= \frac{1}{2\nu_2}+\sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,\nu_2^{2j}}$$

on expansion in powers of $\nu_2^{-1}$. This asymptotic expansion is an interesting one in which the
early coefficients are very simple, for the series is

$$\frac{1}{2\nu_2}+\frac{1}{4\nu_2^2}-\frac{1}{8\nu_2^4}+\frac{1}{4\nu_2^6}-\frac{17}{16\nu_2^8}+\ldots \tag{18}$$
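The agreement between the convergent factorial series obtained on integration and the asymptotic series (18) can be seen numerically. The sketch below evaluates both for an integer argument; the value 10 is an arbitrary illustrative choice and the function names are hypothetical.

```python
import math

def alternating_sum_series(nu, terms=60):
    """sum_{j>=0} j! (nu-1)! / (2^(j+1) (nu+j)!) for integer nu >= 1.

    Successive terms are built from the ratio (j+1) / (2 (nu+j+1)).
    """
    total = 0.0
    term = 1.0 / (2 * nu)  # j = 0 term
    for j in range(terms):
        total += term
        term *= (j + 1) / (2 * (nu + j + 1))
    return total

def alternating_sum_asymptotic(nu):
    """First five terms of the asymptotic series (18)."""
    return (1 / (2 * nu) + 1 / (4 * nu ** 2) - 1 / (8 * nu ** 4)
            + 1 / (4 * nu ** 6) - 17 / (16 * nu ** 8))

nu = 10
print(alternating_sum_series(nu), alternating_sum_asymptotic(nu))
```

For an argument of 1 the convergent series reduces to the familiar expansion of log 2, which gives an independent check of the first function.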



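The various cases are set out below:

Even cumulants (r = 2s, s > 0)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_r = (r-1)!\Big(\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big)+\frac{(r-2)!}{\nu_2^{r-1}}+\frac{(r-1)!}{\nu_2^r}-\sum_{j=1}^{\infty}\frac{(-4)^jB_j\,(2j+r-2)!}{(2j)!\,\nu_2^{2j+r-1}}, \tag{19}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_r = (r-1)!\Big\{\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-1)^r}\Big\}+\frac{(r-2)!}{\nu_2^{r-1}}+\frac{(r-1)!}{2\nu_2^r}-\sum_{j=1}^{\infty}\frac{(-1)^jB_j\,(2j+r-2)!}{(2j)!\,\nu_2^{2j+r-1}}. \tag{20}$$

Odd cumulants (r = 2s + 1, s > 0)

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_r = -(r-1)!\Big\{\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-2)^r}\Big\}, \tag{21}$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_r = -(r-1)!\Big\{\frac{1}{\nu_1^r}+\frac{1}{(\nu_1+2)^r}+\ldots+\frac{1}{(\nu_2-1)^r}\Big\}+\frac{(r-1)!}{2\nu_2^r}+\sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j\,(2j+r-2)!}{(2j)!\,\nu_2^{2j+r-1}}. \tag{22}$$

In the special case of s = 1, we have

$\nu_1 = 2p$, $\nu_2 = 2q$, or $\nu_1 = 2p+1$, $\nu_2 = 2q+1$

$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big)-\Big(\frac{1}{\nu_1}+\frac{1}{\nu_1+2}+\ldots+\frac{1}{\nu_2-2}\Big),$$

$\nu_1 = 2p$, $\nu_2 = 2q+1$

$$\kappa_1 = \tfrac12\log\Big(\frac{\nu_2}{\nu_1}\Big)-\Big(\frac{1}{\nu_1}+\frac{1}{\nu_1+2}+\ldots+\frac{1}{\nu_2-1}\Big)+\frac{1}{2\nu_2}+\sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,\nu_2^{2j}}. \tag{23}$$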


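3. The logarithmic t-distribution

When $\nu_1 = 1$, $z = \log t$, and we thus have as a special case for the distribution of $\log t$ for
$\nu_2 = n$ degrees of freedom:

$$2\kappa_1 = \log n+\mathrm{Lt}_{s\to1}\{\zeta(s,\tfrac12 n)-\zeta(s,\tfrac12)\} \tag{24}$$

$$= \log n+\psi(\tfrac12)-\psi(\tfrac12 n) \quad (\psi(\tfrac12) = -\gamma-2\log 2). \tag{25}$$

For small n

$$n = 2p: \quad \kappa_1 = \tfrac12\log n-\log 2-\Big(\frac12+\frac14+\ldots+\frac{1}{n-2}\Big),$$

$$n = 2p+1: \quad \kappa_1 = \tfrac12\log n-\Big(1+\frac13+\frac15+\ldots+\frac{1}{n-2}\Big). \tag{26}$$

For large n

$$\kappa_1 \sim -\tfrac12(\gamma+\log 2)+\frac{1}{2n}+\frac{1}{6n^2}-\ldots \tag{27}$$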



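Also

$$2^s\kappa_s = (s-1)!\,\{\zeta(s,\tfrac12 n)+(-1)^s\zeta(s,\tfrac12)\} \quad (s>1), \tag{28}$$

$$= \psi^{(s-1)}(\tfrac12)+(-1)^s\psi^{(s-1)}(\tfrac12 n) \quad \big(\psi^{(s-1)}(\tfrac12) = (-1)^s(s-1)!\,(2^s-1)\zeta(s)\big). \tag{29}$$

For small n we have the following cases:

Even cumulants (r = 2s, s > 0)

$$n = 2p: \quad \kappa_r = (r-1)!\Big\{\zeta(r)-\Big(\frac{1}{2^r}+\frac{1}{4^r}+\ldots+\frac{1}{(n-2)^r}\Big)\Big\},$$

$$n = 2p+1: \quad \kappa_r = (r-1)!\Big\{2\zeta(r)(1-2^{-r})-\Big(1+\frac{1}{3^r}+\frac{1}{5^r}+\ldots+\frac{1}{(n-2)^r}\Big)\Big\}. \tag{30}$$

Odd cumulants (r = 2s + 1, s > 0)

$$n = 2p: \quad \kappa_r = -(r-1)!\Big\{\zeta(r)(1-2^{1-r})+\Big(\frac{1}{2^r}+\frac{1}{4^r}+\ldots+\frac{1}{(n-2)^r}\Big)\Big\},$$

$$n = 2p+1: \quad \kappa_r = -(r-1)!\Big\{1+\frac{1}{3^r}+\frac{1}{5^r}+\ldots+\frac{1}{(n-2)^r}\Big\}. \tag{31}$$

For large n

$$\kappa_s \sim (-1)^s(s-1)!\,\zeta(s)(1-2^{-s})+\frac{(s-2)!}{2n^{s-1}}+\frac{(s-1)!}{2n^s}+2\sum_{j=1}^{\infty}\frac{(-4)^{j-1}B_j\,(2j+s-2)!}{(2j)!\,n^{2j+s-1}}. \tag{32}$$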


In the special case of $n = \infty$ we have for the distribution of $\log x$, where x is a normal variable
with zero mean and unit standard deviation:

$$\kappa_1 = -\tfrac12(\gamma+\log 2), \qquad \kappa_s = (-1)^s(s-1)!\,\zeta(s)(1-2^{-s}), \tag{33}$$

as follows also from the case of $\tfrac12\log(\chi^2/n)$ on putting n = 1.


4. Note on the $\chi^2$ distribution approximation

Fisher's result that $\sqrt{(2\chi^2)}$ is approximately normally distributed about a mean of $\sqrt{(2n-1)}$
with unit variance (n being the number of degrees of freedom) is well known. The demonstration
depends on showing that the mean value of $\chi$ is

$$\kappa_1 = \sqrt2\,\Gamma\tfrac12(n+1)/\Gamma(\tfrac12 n) \sim \sqrt{(n-\tfrac12)} \quad \text{for large } n$$

and that the variance is $n-\kappa_1^2 \sim \tfrac12$,
but to this order of approximation it is not possible to show that $\gamma_1$ and $\gamma_2$ tend to zero with
increasing n. A formula for the ratio of the two Gamma functions, developed as far as terms
in $n^{-3}$ (see Wishart (1925)), gives $\gamma_1 \sim (2n)^{-\frac12}$ and $\gamma_2 = O(n^{-2})$ (see, for example, Kendall
(1945)), but owing to the vanishing of the term in $n^{-1}$ of $\gamma_2$ its leading term has so far not been
accurately obtained, although the exact (but somewhat complicated) expressions for the
$\beta_1$ and $\beta_2$ of the distribution of $s = \sigma\chi/\sqrt{(n+1)}$ were given in an editorial in Biometrika (1915),
10, 522.

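Since

$$\frac{d}{dn}\{\log\Gamma\tfrac12(n+1)-\log\Gamma(\tfrac12 n)\} \sim \frac{1}{2n}+\sum_{j=1}^{\infty}\frac{(-1)^{j-1}(2^{2j}-1)B_j}{2j\,n^{2j}},$$

by the formula given in § 2, we find on integration, and insertion of the appropriate constant,
that

$$\log\Gamma\tfrac12(n+1)-\log\Gamma(\tfrac12 n) = \tfrac12\log(\tfrac12 n)+\sum_{j=1}^{\infty}\frac{(-1)^{j}(2^{2j}-1)B_j}{2j(2j-1)\,n^{2j-1}},$$

and thus have

$$\frac{\Gamma\tfrac12(n+1)}{\Gamma(\tfrac12 n)} = \sqrt{(\tfrac12 n)}\,\exp\Big\{\sum_{j=1}^{\infty}\frac{(-1)^{j}(2^{2j}-1)B_j}{2j(2j-1)\,n^{2j-1}}\Big\}, \tag{34}$$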



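which can readily be expanded to give the additional terms necessary to enable the cumulants
of $\chi$ (or of $\sqrt{(2\chi^2)}$) to be worked out (see Johnson & Welch (1939)). Taking $\sqrt{\{\tfrac12(n-\tfrac12)\}}$
as the first approximation we find

$$\frac{\Gamma\tfrac12(n+1)}{\Gamma(\tfrac12 n)} = \sqrt{\{\tfrac12(n-\tfrac12)\}}\,\exp\Big\{\frac{1}{16n^2}\Big(1+\frac1n+\frac{1}{8n^2}-\frac{3}{4n^3}+\ldots\Big)\Big\}
= \sqrt{\{\tfrac12(n-\tfrac12)\}}\,\Big(1+\frac{1}{16n^2}\Big)+\ldots, \tag{35}$$

thus providing a second approximation to the ratio of two Gamma functions differing by
one-half. The cumulants of $\sqrt{(2\chi^2)}$ are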




< 3fi > 

80 that * - + + 0( ” ”>• 

The Editorial in Biometrika (1915), 10, 523 calls attention in a footnote to 'Student's'
approximations for the $\beta_1$ and $\beta_2$ of the sample standard deviation. The above formulae
show that 'Student's' results should be


A-s('+£+eH-^ 

A- 3 ( 1+ i + s») +0( ”‘ ) - 

in which n is now the size of the sample. For n = 10 these give values too low by 2 and 5
respectively in the fourth place of decimals. Practically four-figure accuracy can be attained
with n as low as 10 if in the terms in $n^{-3}$ we replace 31/8 by 17/4 and 7/8 by 1.
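The expansion of $\log\Gamma\tfrac12(n+1)-\log\Gamma(\tfrac12 n)$ used in this section can be checked numerically against the log-gamma function. The sketch below assumes the older all-positive Bernoulli notation ($B_1 = 1/6$, $B_2 = 1/30$, ...) and keeps four terms of the sum; the value n = 10 is an illustrative choice and the function name is hypothetical.

```python
import math

# Bernoulli numbers B_1 ... B_5 in the older all-positive notation.
B = [1/6, 1/30, 1/42, 1/30, 5/66]

def log_gamma_ratio_series(n, terms=4):
    """(1/2) log(n/2) + sum_j (-1)^j (2^(2j) - 1) B_j / (2j (2j-1) n^(2j-1))."""
    total = 0.5 * math.log(n / 2)
    for j in range(1, terms + 1):
        total += ((-1) ** j * (2 ** (2 * j) - 1) * B[j - 1]
                  / (2 * j * (2 * j - 1) * n ** (2 * j - 1)))
    return total

n = 10
exact = math.lgamma((n + 1) / 2) - math.lgamma(n / 2)
series = log_gamma_ratio_series(n)
print(exact, series)
```

Even at n = 10 the truncated series agrees with the exact value to about eight decimal places.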


REFERENCES 

Bartlett, M. S. & Kendall, D. G. (1946). J. R. Statist. Soc. Suppl. 8, 128.

British Association (1931). Mathematical Tables, 1, 42. London: Cambridge University Press.

Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. Internat. Statist. 4, 1.

Davis, H. T. (1933, 1935). Tables of the Higher Mathematical Functions, 1, 2. Indiana: Principia Press.

Fisher, R. A. & Yates, F. (1943). Statistical Tables, 2nd ed. London: Oliver and Boyd.

Johnson, N. L. & Welch, B. L. (1939). Biometrika, 31, 216.

Kendall, M. G. (1945). Advanced Theory of Statistics, 1, 2nd ed., § 12.7. London: Griffin and Co.

Pairman, E. (1919). Tracts for Computers, no. 1. London: Cambridge University Press.

Thompson, C. M. (1941). Biometrika, 32, 168.

Wishart, J. (1925). Biometrika, 17, 68.





THE MEANING OF A SIGNIFICANCE LEVEL 


By G. A. BARNARD 


A level of significance is a probability. To say that a given result is significant on the 5%
level means that some class of events has probability 0.05. Now whatever theory we may
hold as to the nature of probability, in order to give a statement of probability a precise 
meaning we must refer to some reference class, or set of data, on which the probability is 
calculated. What is the reference class involved in a level of significance? 

To many people the answer to this question seems simple enough. The reference class 
involved is the set of indefinite (possibly imaginary) repetitions of the experiment which 
gave the result in question. Otherwise put, the data, on which the probability is calculated, 
are the external conditions of the experiment. The following example indicates, however, 
that the meaning of this reference class is not always clear. The example is a modified form 
of one given by Prof. R. A. Fisher in a letter to the author. 

Suppose we have a bag of chrysanthemum seeds, known to give plants having white 
flowers or plants having purple flowers, no other colours being possible. We suspect that 
the proportions of white and purple seeds are equal, and to test this hypothesis we select 
at random ten seeds from the bag, and plant them. Nine of the plants grow to maturity, 
and all of them have white flowers. On what level of significance can we reject the hypothesis 
of equality of proportions ? We may assume that white and purple plants are equally viable. 

It would be natural to argue that, if white and purple flowers were equally likely, the
probability of our result would be $1/2^9$. If there is no reason to suspect an excess of white
rather than an excess of purple flowers, we must add to this the probability of getting nine
purple flowers, which is also $1/2^9$, giving a total probability of $1/2^8$. The hypothesis of equality
of proportions would then be rejected on the 1/256, or the 0.3906% level of significance.
But if we did this our reference class would not be the set of indefinite repetitions of the
experiment, in its ordinary meaning.

A repetition of the experiment, in its ordinary meaning, would consist of another selection 
of ten seeds from the bag, and their planting and growth. On such another occasion all ten 
plants might grow to maturity, or all or some might die. These possibilities have not been 
taken into account in our calculation of probability, so far. 

To allow for the possible variation in the number of plants which grow, we might lay out
the set of all possible results of the experiment as in Fig. 1, where n denotes the number
of plants that grow, and r denotes the excess of white over purple. Thus any point in the figure
can be referred to uniquely by its co-ordinates (n, r). If we now introduce a parameter p,
to denote the probability (if it exists) that a plant will grow to maturity, given that it has
been selected, the probability associated with the point (n, r) on the hypothesis of equality
of proportions of white and purple will be

$$W(n,r \mid p) = \frac{10!}{n!\,(10-n)!}\,p^n(1-p)^{10-n}\,\frac{n!}{2^n\,(\tfrac12(n+r))!\,(\tfrac12(n-r))!},$$

and since this is a function of the unknown p, we have a special problem of arranging the
points (n, r) in order of significance before we can establish a test. The situation in this
respect is similar to that dealt with in the paper on 2 x 2 tables, printed earlier in this issue
(Barnard, 1946, pp. 123-38 above).



Proceeding as in the earlier paper, we notice first that the same level of significance must
apply to (n, r) as to (n, -r), so that we can confine our further considerations to the upper
half of the diagram. Now in this half, the transition from (n, r) to (n + 1, r + 1) means we
discover that one of the plants which failed to grow in our case, was in fact a white-flowered
plant. In this case our conviction that there is an excess of white-flowered plants would be
strengthened, so that (n + 1, r + 1) would be reckoned more significant than (n, r). Similarly,
going from (n, r) to (n + 1, r - 1) would mean that a missing plant was found to be purple,
and this would weaken our belief in an excess of white-flowered plants; consequently,
(n, r) would be reckoned more significant than (n + 1, r - 1). Finally, going from (n, r) to
(n + 2, r) would mean growing two more plants, one purple and one white, and this would
increase our tendency to believe in the equality of proportions. Consequently, (n, r) would
be reckoned more significant than (n + 2, r). These principles taken together imply that
points lying north-east, or west, of a given point (n, r), or between these two directions,
would be reckoned more significant than (n, r); while, conversely, points lying east to
south-west (inclusive) from (n, r) would be reckoned less significant than (n, r). The relative
significance of points lying inside the half-quadrants north-east to east and south-west to
west would remain undetermined.

[Fig. 1. Diagram of the possible results (n, r): horizontal axis n, from 0 to 10; vertical axis r, from -10 to 10.]

We could now proceed as in the paper (1), building up a test, consistent with the above
partial ordering, in such a way as to make the significance or otherwise of our result depend
as little as possible on any knowledge we may have about the value of p. But we need not
carry this through for the result we have quoted, since our conditions by themselves require
that the only points in the diagram which should be reckoned not less significant than our
result are the points (9, 9), (9, -9), (10, 10) and (10, -10). The probability associated with
these four points is

$$P(9,9;p) = 2(10p^9(1-p)\,.\,2^{-9}+p^{10}\,2^{-10}) = (p/2)^9(20-19p),$$

the maximum value of which occurs when p = 18/19, and is $P_m(9,9) = 0.002413$. Thus on
this basis we should conclude that our result was significant on the 0.2413% level.
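The calculation above can be followed step by step in a short sketch: the probability of each point (n, r) is the binomial chance that n of the ten seeds grow, multiplied by the chance of the observed colour split among those n, and the four extreme points are then summed and maximized over p. The grid search and the function names are illustrative assumptions, not part of the original argument. Direct evaluation in this way locates the maximum at p = 18/19 and gives a maximum probability of about 0.0024.

```python
from math import comb

def W(n, r, p, total=10):
    """Probability of the point (n, r): n plants grow, excess r of white over purple."""
    if (n + r) % 2 or abs(r) > n:
        return 0.0  # impossible combination of n and r
    white = (n + r) // 2
    grow = comb(total, n) * p ** n * (1 - p) ** (total - n)
    split = comb(n, white) / 2 ** n  # equal proportions of white and purple
    return grow * split

def P_extreme(p):
    """Total probability of the four points not less significant than (9, 9)."""
    points = [(9, 9), (9, -9), (10, 10), (10, -10)]
    return sum(W(n, r, p) for n, r in points)

# Crude grid search for the maximizing p (illustrative device only).
grid = [i / 100000 for i in range(100001)]
p_best = max(grid, key=P_extreme)
p_max = P_extreme(p_best)

naive_level = 2 / 2 ** 9  # first calculation, ignoring the plant that failed
print(p_best, p_max, naive_level)
```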



The difference between the first result, 0.3906%, and the second, 0.2413%, is in practice
negligible. Somewhat larger differences will be found in other similar cases, however, and
it seems worth while to try to clarify the cause of the discrepancy.

Consider three possible causes for the failure of the tenth plant to grow to maturity: 

( 1 ) The bag from which the seed was taken is known to contain a proportion of dead seeds, 
which are physically indistinguishable from the live ones, and the tenth seed planted 
happened to be one of these. The conditions of growth were such that any live seed planted 
would have grown. 

(2) The tenth plant happened to be attacked by a soil pest, which destroyed it. 

(3) The statistician trod on the tenth plant while running for a bus; otherwise, it would 
have grown. 

If we now consider what would happen in these three cases if the experiment were
repeated, in case (1) we should be just as uncertain as before how many plants would grow,
out of those selected. In case (2), we might or might not happen to strike a good year for
the pest in question, so that we might or might not have a similar accident recurring. In
case (3) we should obviously give the statistician firm instructions not to be careless, and
then we could be reasonably certain that all the plants selected would grow.*

In the first case, we can suppose that the proportions of white, purple, and dead seeds in
the bag are, respectively, $p_1$, $p_2$, and $1-(p_1+p_2)$, and the purpose of our experiment is to
test the hypothesis $p_1 = p_2$. In this case, putting $p_1+p_2 = p$, we can clearly apply the
analysis of Fig. 1, and the appropriate level of significance is 0.2413%.

In the third case, the situation actually realized is just what it would have been if we had 
warned the statistician beforehand, and then thrown one of the ten seeds back into the bag. 
Thus our effective sample size here is 9, and the appropriate level of significance is 0*3906 %. 

In the second case, the answer depends on our attitude to the set of accidents of which 
the pest is a specimen. If this set of accidents is regarded as a stable set of chance causes 
we may be justified in representing its effect on the growth of our plants by the prob- 
ability p. If, on the other hand, the incidence of such pests undergoes, say, regular cyclical 
fluctuations from year to year, so that its incidence is to some extent predictable, if not 
wholly controllable, then we should not be justified in assuming the existence of a real 
probability corresponding to our parameter p. We should, to be on the safe side, in this case 
allow for the possibility that experimental technique might improve in the future, to such 
an extent as to eliminate the possibility of such accidents. Thus, adopting this conservative 
attitude to our results, we should here treat the effective sample size as 9. The repetitions 
of the experiment which we have in mind would then be imaginary repetitions, in which 
experimental technique was supposed to be better than it is now, and we have as much 
control over pests as we have over statisticians. 

The general situation illustrated by this example can be described in terms of the notion 
of ‘isolate’ introduced by Prof. H. Levy (1931). In making an experiment, we try to 
construct an isolate — a system, or part of the world, which we suppose has relatively little 
interaction with the rest of the world, and which, for practical purposes, may be considered 
on its own. This isolate may contain within itself all the systems of chance causes which are 

* It is not suggested that the three cases exhaust the multiplicity of types which might arise in
practice. As Prof. Pearson has pointed out, if it were not the statistician, but his three-year-old son 
who was the vandal in case (3), we should have here a situation intermediate between our second and 
third instances. 



regarded as affecting, to any practical extent, the results of the experiment. Such is the case 
in (1), where all the chance causes involved in the experiment are supposed given in 
the bag which is the subject of the experiment. Here, then, we are dealing with a ‘good 
isolate’, whose interaction with the rest of the yorld is really negligible, and chance oauses 
operate within theusolate. 

In case (3), on the other hand, we are dealing with an imperfect isolate. The outside world, 
in the shape of the statistician, interacts with our isolate to an extent not negligible in 
practice. Fortunately, in this case, we are able to construct a smaller isolate, consisting of 
the nine surviving plants, in which the interactions with the outside world are negligible. 
In case (2), there may be some doubt as to what isolate we are discussing. If we regard soil 
pests and such things as included in the isolate, and represent them as a stable set of chance 
causes, then we are entitled to analyse as in case (1); but if the pests are not included in the 
isolate, we should analyse as in case (3). 

Statistical tests are applicable to at least two types of experiment. First, to experiments 
in which the isolate studied contains within itself a system of chance causes which may 
influence the results. And second, to experiments in which the isolate studied is not a ‘good’ 
isolate, and the residual interactions with the rest of the world may affect the results. There 
may also be mixed cases. 

The distinction between the two types may also be brought out in relation to the necessity 
or otherwise of an ‘artificial’ randomization procedure, using random digits or the like. 
In the first type, such an artificial randomization procedure is not strictly necessary; for 
example, with our bag of seeds, the bag itself, and its physically indistinguishable contents, 
forms a perfectly adequate randomizer. We have in this case, as it were, an impermeable 
shield around the system, which prevents any external shocks from affecting the system. 
In the second type of experiment, we need to ensure that the interactions with the outside 
world will not mask the results we are interested in: and if we cannot ensure a practically 
complete separation from the outside world, then the effect of external interactions must 
be randomized, by a special procedure. The randomization here acts like a shock absorber, 
specially placed around the experiment to distribute external shocks evenly through the 
system. 

In the first type of experiment, the reference class to which the significance level applies 
is in fact the set of indefinite repetitions of the experiment in question. In the second type 
of experiment, the reference class is an ideal set, in which the accidental influences of the 
outside world repeat themselves exactly, while the effect of these accidents on the system 
varies as a result of the special randomization. 


REFERENCES 

Barnard, G. A. (1946). Significance tests for 2 x 2 tables. Biometrika, 34, 123. 
Levy, H. (1931). The Universe of Science. London: Watts and Co. 



Volume XXXIV. Parts III and IV 


December 1947 


ON THE DISTRIBUTION OF THE RANK CORRELATION 
COEFFICIENT $\tau$ WHEN THE VARIATES ARE 
NOT INDEPENDENT 


By WASSILY HÖFFDING 


I. Introduction 

1. Consider a population distributed according to two variates $x$, $y$. Two members 
$(x_1, y_1)$ and $(x_2, y_2)$ of the population will be called concordant if both values of one member 
are greater than the corresponding values of the other one, that is if 

$$x_1 < x_2,\ y_1 < y_2 \quad\text{or}\quad x_1 > x_2,\ y_1 > y_2.$$

They will be called discordant if for one member one value is greater and the other one smaller 
than for the other member, that is if 

$$x_1 < x_2,\ y_1 > y_2 \quad\text{or}\quad x_1 > x_2,\ y_1 < y_2.$$

The probability $p$ that two members drawn from the population at random without 
replacement are concordant will be called the probability of concordance, the probability 
$q$ that they are discordant will be called the probability of discordance. 

In the following only populations will be considered for which the probabilities of $x_1 = x_2$ 
or $y_1 = y_2$ are zero, so that 

$$p + q = 1. \tag{1}$$

The main types of such populations are (a) an infinite population with both x and y 
distributed continuously, (b) a finite population where all values of x and all values of y are 
different among themselves. The condition that the two members are drawn without replace- 
ment is, of course, only relevant in case (b). 

For a sample of n members drawn from the population, the probabilities of concordance 
and discordance are defined in the same manner as for the population. They will be denoted 
by p' and q' to distinguish them from the population values. If for the population (1) is 
fulfilled, it may be assumed that all values of x and all values of y in the sample are different, 
so that 

$$p' + q' = 1. \tag{2}$$

It follows from the definition that $p'$ is the relative frequency of concordant pairs among 
the $\binom{n}{2}$ pairs which can be formed from the members of the sample. 


The probability of concordance expresses an essential property of a bivariate distribution. 
It may in itself be considered as a measure of correlation. $p'$ is an estimate of $p$; it will be 
shown that the mean value of $p'$ is $p$. If a coefficient lying between the limits $-1$ and $+1$ is 
preferred, the quantity 

$$\tau = p' - q' = 2p' - 1 \tag{3}$$

may be taken. 
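The definitions of $p'$, $q'$ and $\tau$ can be illustrated by a direct count over all pairs of a sample. The following is only a sketch: the function name `concordance` and the five data points are hypothetical, and exact fractions are used so that relation (3) can be checked without rounding.

```python
from fractions import Fraction
from itertools import combinations


def concordance(sample):
    """Return (p', q', tau) for a list of (x, y) pairs without tied values."""
    conc = disc = 0
    for (x1, y1), (x2, y2) in combinations(sample, 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            conc += 1          # both coordinates differ in the same direction
        elif s < 0:
            disc += 1          # the coordinates differ in opposite directions
        else:
            raise ValueError("tied values are excluded by condition (1)")
    pairs = conc + disc        # equals n(n-1)/2
    p, q = Fraction(conc, pairs), Fraction(disc, pairs)
    return p, q, p - q


# Hypothetical sample of five members.
sample = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 5)]
p_hat, q_hat, tau = concordance(sample)
```

For this sample seven of the ten pairs are concordant, so that $p' = 7/10$ and $\tau = 2/5$.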


2. The quantity $p$ here termed the probability of concordance was apparently first 
considered by Esscher (1924) who also used the quantity 

$$D = \frac{1}{\binom{n}{2}} \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sign}(x_i - x_j)\,\operatorname{sign}(y_i - y_j),$$

(where $x_i$, $y_i$, $i = 1, \ldots, n$, are the sample values of the variates) which is the same as the 
coefficient $\tau$ as defined by (3). Esscher showed that if $x$ and $y$ are normally correlated with 
correlation coefficient $r$, the expectation of $D = \tau$ is 

$$E(\tau) = \frac{2}{\pi} \sin^{-1} r. \tag{4}$$

Hence, from this equation, he suggested estimating $r$ from ranked data by means of the relation 

$$r = \sin \frac{\pi}{2}\tau = \sin \pi\,(p' - \tfrac{1}{2}).$$


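For the variance of $\tau$ Esscher found in the case of a normally distributed population 

$$\frac{1}{4}\binom{n}{2} \operatorname{var}(\tau) = pq + \frac{n-2}{2}\left\{\frac{1}{9} - \left(\frac{2}{\pi}\sin^{-1}\frac{r}{2}\right)^2\right\}, \tag{5}$$

where 

$$4pq = 1 - \left(\frac{2}{\pi}\sin^{-1} r\right)^2.$$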
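Esscher's results for the normal case may be checked roughly by simulation. The sketch below is illustrative only: it assumes (5) in the commonly quoted form with the bracket $\frac{1}{9} - \left(\frac{2}{\pi}\sin^{-1}\frac{r}{2}\right)^2$, and the function names, the seed and the choice $r = 0.5$, $n = 30$ are all hypothetical.

```python
import math
import random
from itertools import combinations


def kendall_tau(xs, ys):
    """tau = (concordant - discordant) / C(n, 2) for samples without ties."""
    n = len(xs)
    s = 0
    for i, j in combinations(range(n), 2):
        s += 1 if (xs[i] - xs[j]) * (ys[i] - ys[j]) > 0 else -1
    return 2 * s / (n * (n - 1))


def esscher_mean(r):
    """Right-hand side of (4)."""
    return 2 / math.pi * math.asin(r)


def esscher_var(r, n):
    """var(tau) solved from (5), in the form assumed in the lead-in."""
    pq = (1 - esscher_mean(r) ** 2) / 4
    bracket = 1 / 9 - (2 / math.pi * math.asin(r / 2)) ** 2
    return 4 * (pq + (n - 2) / 2 * bracket) / math.comb(n, 2)


def simulate(r, n, reps, seed=0):
    """Mean and variance of tau over repeated bivariate normal samples."""
    rng = random.Random(seed)
    taus = []
    for _ in range(reps):
        xs = [rng.gauss(0, 1) for _ in range(n)]
        ys = [r * x + math.sqrt(1 - r * r) * rng.gauss(0, 1) for x in xs]
        taus.append(kendall_tau(xs, ys))
    mean = sum(taus) / reps
    var = sum((t - mean) ** 2 for t in taus) / (reps - 1)
    return mean, var


r, n = 0.5, 30
sim_mean, sim_var = simulate(r, n, reps=1000)
# Esscher's relation recovers r from the expected value of tau.
r_back = math.sin(math.pi / 2 * esscher_mean(r))
```

With these settings the simulated mean and variance of $\tau$ should lie close to `esscher_mean(r)` and `esscher_var(r, n)`; the agreement is statistical, not exact.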


While Esscher saw in $p'$ and $D = \tau$ only a means for estimating $r$, Lindeberg (1926) stressed 
the significance of the probability of concordance itself for judging the degree of dependence 
between the variates. He proposed for that purpose the coefficient 

$$P = 100p' - 50 = 50\tau,$$

called by him Korrelationsprozent. Lindeberg also gave, without proof, a formula for the 
variance of $p'$ in the general case of correlated variates (see (13) below). 

Jordan (1927) suggested using, instead of Lindeberg's $P$, the coefficient later termed by 
Kendall $\tau$. 

Kendall (1938), independently of the above authors, proposed $\tau$ as a measure of rank 
correlation. He completely solved the problem of the sampling distribution of $\tau$ in a universe 
in which all possible rankings are equally probable, showing that it rapidly tends to 
normality for increasing $n$. 


3. The main object of this paper is to show that the sampling distribution of $p'$ (and hence 
that of $\tau$) tends to normality as $n \to \infty$ for any population with continuously distributed $x$ 
and $y$ if a certain condition is fulfilled (Part IV). In addition, Lindeberg's formula for 
the variance of $p'$ is proved (Part II) and extended for a finite population (Part V). Finally, 
in Part VI the problem of estimating var $(p')$ from the sample is considered. 


II. Mean value and variance of p' in the case of an infinite population 


4. Consider a sample of $n$ drawn at random from an infinite population with continuous 
$x$ and $y$. Replace the values of $x$ and of $y$ in the sample by their ranks and arrange the 
members of the sample so that the ranking of $x$ is $1, 2, \ldots, n$. Then the ranking of $y$ is a 
permutation 

$$\Pi = (\pi_1, \ldots, \pi_n)$$

of the numbers $1, \ldots, n$. 

Let $I$ and $J$ be the numbers of inversions in the permutations $(\pi_n, \pi_{n-1}, \ldots, \pi_1)$ and 
$(\pi_1, \ldots, \pi_n)$. Then 

$$p' = \frac{2I}{n(n-1)}, \qquad q' = \frac{2J}{n(n-1)}. \tag{6}$$

Thus the knowledge of the permutation $\Pi$ corresponding to the given sample is sufficient 
for evaluating $p'$. 
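The evaluation of $p'$ and $q'$ from the numbers of inversions, as in (6), may be sketched as follows. The helper names and the example permutation are hypothetical; the quadratic inversion count is chosen for clarity, not for speed.

```python
from fractions import Fraction


def inversions(perm):
    """Number of index pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def p_q_from_permutation(perm):
    """p' and q' from (6); perm holds the y-ranks after ordering by x."""
    n = len(perm)
    J = inversions(perm)          # discordant pairs
    I = inversions(perm[::-1])    # concordant pairs (inversions of the reversed ranking)
    return Fraction(2 * I, n * (n - 1)), Fraction(2 * J, n * (n - 1))


perm = [3, 1, 4, 2, 5]            # hypothetical ranking of y
p_hat, q_hat = p_q_from_permutation(perm)
```

Here $J = 3$ and $I = 7$, so that $p' = 7/10$ and $q' = 3/10$.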





5. Let $P(\Pi)$ be the probability of drawing a random sample represented by the permutation 
$\Pi$. Let $p'(\Pi)$ be the probability of concordance for such a sample. Then 

$$p = \Sigma\, P(\Pi)\, p'(\Pi), \tag{7}$$

where the sum is extended over all permutations $\Pi$ of $n$ numbers. 

The right-hand side of (7) is equal to the mean value of $p'$. Hence 

$$E p' = p. \tag{8}$$

Consider, in generalization of $p'$, the probability $w'$ that among $m \le n$ members drawn 
from the sample at random without replacement, certain pairs of members are concordant; 
for instance, among four members $A$, $B$, $C$, $D$, the pairs $AB$, $AC$, $AD$; or the pairs $AB$, $CD$, 
etc. Let $w$ be the corresponding probability for the parent population. Then it is seen in 
the same manner as with $p'$ that 

$$E w' = w. \tag{9}$$

Thus, if we can express $(p')^\mu$, the probability of drawing $\mu$ concordant pairs from the 
sample, replacing each pair after drawing it, by probabilities without replacement of the 
type $w'$, we can also, in virtue of (9), represent $E(p')^\mu$ by population parameters of the type $w$. 

6. Now, $(p')^2$, the probability of drawing from the sample one concordant pair and, after 
replacing it, of drawing again a concordant pair, is the sum of the following three probabilities: 

(a) the probability of getting the same pair in both drawings, $1/\binom{n}{2}$, multiplied by 
the probability that this pair is concordant ($p'$); 

(b) the probability that the second pair has one member in common with the first pair, 
$2(n-2)/\binom{n}{2}$, multiplied by the probability, say $k'$, that among three members $A$, $B$, $C$ 
drawn from the sample without replacement, one, say $A$, is concordant with the other two; 

(c) the probability that the second pair has no member in common with the first one, 
$\binom{n-2}{2}/\binom{n}{2}$, multiplied by the probability that among four members $A$, $B$, $C$, $D$ drawn 
without replacement, two pairs without a member in common, say $AB$ and $CD$, are concordant. 
The latter probability may be denoted by $(p^2)'$ since the corresponding probability 
for the infinite population is $p^2$. 

Thus, 

$$\binom{n}{2}(p')^2 = p' + 2(n-2)k' + \binom{n-2}{2}(p^2)', \tag{10}$$

and, applying (9), 

$$\binom{n}{2}E(p')^2 = p + 2(n-2)k + \binom{n-2}{2}p^2. \tag{11}$$

Hence, we have for the variance of $p'$ 

$$\binom{n}{2}\operatorname{var}(p') = \binom{n}{2}\{E(p')^2 - p^2\} = p + 2(n-2)k - (2n-3)p^2 \tag{12}$$

or 

$$\binom{n}{2}\operatorname{var}(p') = \binom{n}{2}\mu_2(p') = p(1-p) + 2(n-2)(k - p^2). \tag{13}$$


This is identical with the formula given without proof by Lindeberg (1926). 
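The decomposition of $(p')^2$ into the three cases (a), (b), (c), that is the identity $\binom{n}{2}(p')^2 = p' + 2(n-2)k' + \binom{n-2}{2}(p^2)'$, holds exactly for every sample without ties and can be confirmed by enumeration. The sketch below uses a hypothetical sample of six members and hypothetical function names; $k'$ and $(p^2)'$ are obtained by listing all ordered triples and quadruples.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb


def concordant(a, b):
    return (a[0] - b[0]) * (a[1] - b[1]) > 0


def sample_probabilities(sample):
    """Exact p', k' and (p^2)' for a sample without ties."""
    pairs = list(combinations(sample, 2))
    p = Fraction(sum(concordant(a, b) for a, b in pairs), len(pairs))
    # k': A concordant with both B and C, over ordered triples (A, B, C).
    triples = list(permutations(sample, 3))
    k = Fraction(sum(concordant(a, b) and concordant(a, c)
                     for a, b, c in triples), len(triples))
    # (p^2)': AB and CD both concordant, over ordered quadruples.
    quads = list(permutations(sample, 4))
    p2 = Fraction(sum(concordant(a, b) and concordant(c, d)
                      for a, b, c, d in quads), len(quads))
    return p, k, p2


sample = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 5), (6, 6)]   # hypothetical data
n = len(sample)
p, k, p2 = sample_probabilities(sample)
lhs = comb(n, 2) * p ** 2
rhs = p + 2 * (n - 2) * k + comb(n - 2, 2) * p2
```

The two sides `lhs` and `rhs` agree as exact fractions.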

7. In the case considered by Kendall where all permutations $\Pi$ of $n$ numbers are equally 
probable, the permutations of $m \le n$ also are equiprobable. Hence 

$$p = P(1, 2) = q = P(2, 1) = \tfrac{1}{2}.$$

Further, representing $k$ as the mean value of $k'$ in a sample of 3, we find 

$$k = P(123) + \tfrac{1}{3}P(132) + \tfrac{1}{3}P(213) = (1 + \tfrac{1}{3} + \tfrac{1}{3})\,\tfrac{1}{6} = \tfrac{5}{18}.$$

Inserting these values in (13), we have 

$$\operatorname{var}(p') = \frac{2n+5}{18\,n(n-1)}, \qquad \operatorname{var}(\tau) = 4\operatorname{var}(p') = \frac{2(2n+5)}{9\,n(n-1)},$$

in accordance with Kendall's formula. 
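For small $n$ Kendall's formula can be confirmed by listing all $n!$ equally probable rankings. The following sketch (function names hypothetical) computes the exact variance of $\tau$ with rational arithmetic and compares it with $2(2n+5)/9n(n-1)$.

```python
from fractions import Fraction
from itertools import permutations


def tau_of(perm):
    """tau for the ranking perm of y when x is in natural order."""
    n = len(perm)
    s = sum(1 if perm[i] < perm[j] else -1
            for i in range(n) for j in range(i + 1, n))
    return Fraction(2 * s, n * (n - 1))


def null_variance(n):
    """Exact variance of tau when all n! rankings are equally probable."""
    taus = [tau_of(pi) for pi in permutations(range(1, n + 1))]
    mean = sum(taus) / len(taus)
    return sum((t - mean) ** 2 for t in taus) / len(taus)


def kendall_formula(n):
    return Fraction(2 * (2 * n + 5), 9 * n * (n - 1))


checks = {n: (null_variance(n), kendall_formula(n)) for n in range(2, 7)}
```

For $n = 3$, for instance, both values are $11/27$.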


III. Some algebraic formulae 

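8. We shall now consider some algebraic relations to be used in the proof of normality 
of $p'$ for large $n$. 

Let $f_d(\rho)$ be a polynomial of degree $d$ in $\rho$. Then 

$$\sum_{\rho=0}^{\beta} (-1)^\rho \binom{\beta}{\rho} f_d(\rho) = \begin{cases} 0 & \text{if } d < \beta, \\ (-1)^\beta \beta!\, a_0 & \text{if } d = \beta, \end{cases} \tag{14}$$

where $a_0$ is the coefficient of the highest power $\rho^d$ in $f_d(\rho)$. 

To prove (14) write 

$$f_d(\rho) = a_0 \rho^{[d]} + a_1 \rho^{[d-1]} + \ldots,$$

where $\rho^{[0]} = 1$, $\rho^{[\delta]} = \rho(\rho - 1) \ldots (\rho - \delta + 1)$, $(\delta \ge 1)$. 

Then (14) follows from the fact that 

$$\sum_{\rho=0}^{\beta} (-1)^\rho \binom{\beta}{\rho} \rho^{[\delta]} = (-1)^\delta \beta^{[\delta]} \sum_{\rho=\delta}^{\beta} (-1)^{\rho-\delta} \binom{\beta-\delta}{\rho-\delta}$$

is equal to $(-1)^\delta \beta^{[\delta]} (1 - 1)^{\beta-\delta} = 0$ if $\beta - \delta > 0$ and to $(-1)^\beta \beta!$ if $\beta = \delta$. 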
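Relation (14), the vanishing of an alternating binomial sum of a polynomial whose degree is below the upper limit of summation, is easily tried numerically. In the sketch below the function name and the two test polynomials are hypothetical, and the sign convention $(-1)^\beta \beta!\, a_0$ for the case of equal degree is the one obtained when the sum carries the factor $(-1)^\rho$.

```python
from math import comb, factorial


def alternating_sum(coeffs, beta):
    """sum over rho = 0..beta of (-1)^rho C(beta, rho) f(rho),
    where f(rho) = coeffs[0] + coeffs[1]*rho + coeffs[2]*rho^2 + ..."""
    def f(rho):
        return sum(c * rho ** i for i, c in enumerate(coeffs))
    return sum((-1) ** rho * comb(beta, rho) * f(rho) for rho in range(beta + 1))


low_degree = alternating_sum([7, -3, 2], beta=4)          # d = 2 < beta = 4
full_degree = alternating_sum([7, -3, 2, 0, 5], beta=4)   # d = 4 = beta, a_0 = 5
```

The first sum vanishes; the second equals $(-1)^4\, 4! \cdot 5 = 120$.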

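9. For any non-negative integer $\nu$ we may write 

$$n^\nu = d^{(\alpha)}_{\nu 0}(n - \alpha)^{[\nu]} + d^{(\alpha)}_{\nu 1}(n - \alpha)^{[\nu-1]} + \ldots + d^{(\alpha)}_{\nu,\nu-1}(n - \alpha) + d^{(\alpha)}_{\nu\nu}. \tag{15}$$

We will study certain properties of the coefficients $d^{(\alpha)}_{\nu\kappa}$. 

From (15) it is seen immediately that 

$$d^{(\alpha)}_{\nu 0} = 1. \tag{16}$$

Inserting in (15) $n = \alpha + \beta$ $(\beta = 0, 1, \ldots)$ we have 

$$(\alpha + \beta)^\nu = d^{(\alpha)}_{\nu,\nu-\beta}\,\beta! + d^{(\alpha)}_{\nu,\nu-\beta+1}\,\beta^{[\beta-1]} + \ldots + d^{(\alpha)}_{\nu,\nu-1}\,\beta + d^{(\alpha)}_{\nu\nu} \tag{17}$$

or 

$$d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!}\Big\{(\alpha+\beta)^\nu - \sum_{\rho=0}^{\beta-1} \beta^{[\rho]}\, d^{(\alpha)}_{\nu,\nu-\rho}\Big\}, \qquad (\beta = 1, 2, \ldots). \tag{18}$$

Hence we find by induction 

$$d^{(\alpha)}_{\nu,\nu-\beta} = \frac{1}{\beta!} \sum_{\rho=0}^{\beta} (-1)^{\beta-\rho}\binom{\beta}{\rho}(\alpha + \rho)^\nu. \tag{19}$$

If we take this as definition of $d^{(\alpha)}_{\nu\kappa}$ for $\kappa < 0$, we have in virtue of (14) 

$$d^{(\alpha)}_{\nu\kappa} = 0 \quad\text{for } \kappa < 0. \tag{20}$$

Expanding $(\alpha + \rho)^\nu$ we have from (19) 

$$d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma} \alpha^\sigma\, \frac{1}{\beta!}\sum_{\rho=0}^{\beta}(-1)^{\beta-\rho}\binom{\beta}{\rho}\rho^{\nu-\sigma}.$$

Comparing the last sum with (19) and writing 

$$d_{\nu\kappa} = d^{(0)}_{\nu\kappa},$$

we have 

$$d^{(\alpha)}_{\nu,\nu-\beta} = \sum_{\sigma=0}^{\nu} \binom{\nu}{\sigma}\alpha^\sigma\, d_{\nu-\sigma,\,\nu-\beta-\sigma},$$

or, putting $\nu - \beta = \kappa$ and noting that, by (20), $d_{\nu-\sigma,\kappa-\sigma} = 0$ for $\sigma > \kappa$, 

$$d^{(\alpha)}_{\nu\kappa} = \sum_{\sigma=0}^{\kappa}\binom{\nu}{\sigma}\alpha^\sigma\, d_{\nu-\sigma,\kappa-\sigma}. \tag{21}$$

We have the recurrence relation 

$$d^{(\alpha)}_{\nu+1,\kappa} - d^{(\alpha)}_{\nu\kappa} = (\alpha + \nu + 1 - \kappa)\, d^{(\alpha)}_{\nu,\kappa-1}, \tag{22}$$

which can be obtained by multiplying (15) by $n$, then writing down (15) with $\nu + 1$ instead 
of $\nu$, and comparing coefficients in both expressions. 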
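The coefficients of the expansion of $n^\nu$ in factorial powers of $n - \alpha$ can be computed and the relations of this paragraph tried on small cases. The sketch below is an assumption-laden illustration: it takes the coefficient of $(n-\alpha)^{[\beta]}$ to be the $\beta$th forward difference of $x^\nu$ at $x = \alpha$ divided by $\beta!$ (Newton's interpolation formula), writes it as `d(alpha, nu, kappa)` with $\kappa = \nu - \beta$, and uses hypothetical function names throughout.

```python
from fractions import Fraction
from math import comb, factorial


def falling(x, j):
    """Factorial power x^[j] = x (x-1) ... (x-j+1), with x^[0] = 1."""
    out = 1
    for i in range(j):
        out *= x - i
    return out


def d(alpha, nu, kappa):
    """Coefficient of (n - alpha)^[nu - kappa] in n^nu, for 0 <= kappa <= nu."""
    beta = nu - kappa
    total = sum((-1) ** (beta - rho) * comb(beta, rho) * (alpha + rho) ** nu
                for rho in range(beta + 1))
    return Fraction(total, factorial(beta))


def expansion(n, alpha, nu):
    """Sum of d * (n - alpha)^[nu - kappa] over kappa; should reproduce n^nu."""
    return sum(d(alpha, nu, k) * falling(n - alpha, nu - k) for k in range(nu + 1))


value = d(0, 5, 1)   # with alpha = 0 this is a Stirling number of the second kind
```

For $\alpha = 0$ the coefficient `d(0, 5, 1)` equals 10, the number of ways of dividing five objects into four non-empty groups.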


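10. We prove now two properties of the coefficients $d^{(\alpha)}_{\nu\kappa}$. 

(I) $d_{\nu\kappa}$ is a polynomial in $\nu$ of degree $2\kappa$, the term of highest degree being $\nu^{2\kappa}/2^\kappa \kappa!$. 

In virtue of (16) this is true for $\kappa = 0$. And if it is true for $\kappa - 1$, the highest term of $d_{\nu+1,\kappa} - d_{\nu\kappa}$ 
is, by (22) with $\alpha = 0$, $\nu^{2\kappa-1}/2^{\kappa-1}(\kappa - 1)!$, and hence that of $d_{\nu\kappa}$, by a well-known theorem, 

$$\frac{1}{2\kappa}\,\frac{\nu^{2\kappa}}{2^{\kappa-1}(\kappa-1)!} = \frac{\nu^{2\kappa}}{2^\kappa \kappa!}.$$

(II) $d^{(\gamma-t)}_{t-\rho,\kappa}$ is a polynomial in $t$ of degree $2\kappa$ with the highest term $(-1)^\kappa t^{2\kappa}/2^\kappa \kappa!$. 

From (21), 

$$d^{(\gamma-t)}_{t-\rho,\kappa} = \sum_{\sigma=0}^{\kappa} \binom{t-\rho}{\sigma} (\gamma - t)^\sigma\, d_{t-\rho-\sigma,\kappa-\sigma}.$$

In $\binom{t-\rho}{\sigma}$ the highest term in $t$ is $t^\sigma/\sigma!$. 

In $d_{t-\rho-\sigma,\kappa-\sigma}$ the highest term in $t$ is $t^{2\kappa-2\sigma}/2^{\kappa-\sigma}(\kappa-\sigma)!$ (by (I)). 

In $(\gamma - t)^\sigma$ the highest term in $t$ is $(-1)^\sigma t^\sigma$. 

Hence, in $d^{(\gamma-t)}_{t-\rho,\kappa}$ the highest term is 

$$\sum_{\sigma=0}^{\kappa} \frac{(-1)^\sigma t^{2\kappa}}{2^{\kappa-\sigma}\,\sigma!\,(\kappa-\sigma)!} = \frac{t^{2\kappa}}{2^\kappa \kappa!}(1 - 2)^\kappa = \frac{(-1)^\kappa t^{2\kappa}}{2^\kappa \kappa!}.$$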

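11. $d_{\nu,\nu-\beta}$ has also a combinatorial meaning. 

Let 

$$\Sigma_\beta(\nu) = \Sigma\, \frac{\nu!}{\nu_1!\,\nu_2! \ldots \nu_\beta!}, \qquad \Sigma'_\beta(\nu) = \Sigma'\, \frac{\nu!}{\nu_1!\,\nu_2! \ldots \nu_\beta!},$$

where $\Sigma$ indicates summation over all $\nu_i \ge 0$, $\Sigma'$ over all $\nu_i \ge 1$, and in both cases $\nu_1 + \ldots + \nu_\beta = \nu$. 
$\Sigma_\beta(\nu)$ is the number of ways of allocating $\nu$ objects on $\beta$ places, and $\Sigma'_\beta(\nu)$ is the number of 
ways of allocating $\nu$ objects on $\beta$ places in such a manner that no place remains empty. 

We have 

$$\Sigma_\beta(\nu) = \beta^\nu,$$

and a little consideration shows that 

$$\Sigma_\beta(\nu) = \Sigma'_\beta(\nu) + \binom{\beta}{1}\Sigma'_{\beta-1}(\nu) + \ldots + \binom{\beta}{\beta-1}\Sigma'_1(\nu).$$

Comparing this with (17) we see that 

$$d_{\nu,\nu-\beta} = \frac{1}{\beta!}\,\Sigma'_\beta(\nu). \tag{23}$$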



IV. Proof of normality of $p'$ for $n \to \infty$ 

12. Any set of different pairs of elements belonging to the population will be briefly referred 
to as a system (two pairs being different if they have no more than one element in common). 

If we represent the elements of a system by points in a plane and the pairs of elements by 
lines joining the points, we have a pattern corresponding to the given system. Two systems 
will be said to have the same pattern if there exists a one-to-one correspondence between the 
elements of both systems such that if two elements of one system form a pair, the two corre- 
sponding elements of the other system also form a pair. Thus the only thing relevant in a 
pattern is the lines connecting the points, the position of the points having no significance. 

A pattern will be called simple if one can pass from any point of the pattern to any other 
one along lines belonging to the pattern. A composite pattern is a pattern consisting of more 
than one simple pattern. 

If the elements of a system (or the points of the corresponding pattern) are denoted 
by different letters A, B, C, ..., each pair of the system can be represented by a pair of 
letters. All systems of one pair have the same pattern (AB). There are two patterns 
of two pairs, one simple and containing three points (AB, BC) and one composite and 
containing four points (AB, CD). There are five patterns of three pairs, three simple 
(AB, BC, CA; AB, BC, CD; AB, AC, AD), one consisting of two different simple patterns 
(AB, CD, DE) and one consisting of three equal simple patterns (AB, CD, EF). 
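The enumeration of patterns can be reproduced mechanically by treating a system as a set of pairs on numbered points and identifying systems that differ only by a renumbering of the points. The following brute-force sketch (function names hypothetical, practicable only for a few pairs) counts the patterns of one, two and three pairs and how many of them are simple.

```python
from itertools import combinations, permutations


def canonical(edges):
    """Smallest relabelled list of pairs over all renumberings of the points."""
    points = sorted({v for e in edges for v in e})
    best = None
    for perm in permutations(range(len(points))):
        relabel = dict(zip(points, perm))
        key = tuple(sorted(tuple(sorted((relabel[u], relabel[v]))) for u, v in edges))
        if best is None or key < best:
            best = key
    return best


def is_simple(edges):
    """True if every point can be reached from every other along the pairs."""
    points = {v for e in edges for v in e}
    seen, stack = set(), [next(iter(points))]
    while stack:
        u = stack.pop()
        if u in seen:
            continue
        seen.add(u)
        stack.extend(w for e in edges if u in e for w in e if w != u)
    return seen == points


def patterns(b):
    """All distinct patterns of b different pairs (at most 2b points)."""
    all_pairs = list(combinations(range(2 * b), 2))
    return {canonical(es) for es in combinations(all_pairs, b)}


found = {b: patterns(b) for b in (1, 2, 3)}
counts = {b: (len(ps), sum(is_simple(p) for p in ps)) for b, ps in found.items()}
```

The result is one pattern of one pair, two of two pairs (one simple) and five of three pairs (three simple), in agreement with the text.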

13. If a simple pattern consists of $a_j$ points and $b_j$ pairs, 

$$a_j \le b_j + 1. \tag{24}$$

For this is true for $b_j = 1$, and by adding one pair to a simple pattern, at most one point 
is added if the new pattern is to be simple again. 

Denote the different simple patterns by $S_1, S_2, \ldots$, where $S_1$ stands for the one-pair 
pattern and $S_2$ for the two-pairs pattern (AB, BC), all $S_j$ with $j \ge 3$ consisting of three or 
more pairs. Let $a_j$ be the number of points and $b_j$ the number of pairs in $S_j$. Then 

$$a_1 = 2, \quad a_2 = 3, \quad b_1 = 1, \quad b_2 = 2, \quad b_j \ge 3 \ \text{if}\ j \ge 3. \tag{25}$$

Consider a pattern $P$ composed of $\gamma_1$ simple patterns $S_1$, $\gamma_2$ simple patterns $S_2$, etc., and 
containing $a$ points and $b$ pairs. Then, writing symbolically 

$$P = \Sigma\, \gamma_j S_j,$$

we have 

$$a = \Sigma\, \gamma_j a_j, \qquad b = \Sigma\, \gamma_j b_j.$$

In virtue of (24), 

$$3b - 2a = \Sigma\, \gamma_j(3b_j - 2a_j) \ge \Sigma\, \gamma_j(b_j - 2),$$

and from (25) 

$$3b - 2a \ge -\gamma_1, \tag{26}$$

the sign of equality holding if, and only if, pattern $P$ contains no other simple patterns than 
$S_1$ and $S_2$. 

14. $(p')^\mu$ is the probability that $\mu$ pairs of elements drawn from the sample, replacing 
each pair after drawing, are all concordant. We may write 

$$(p')^\mu = \Sigma\, A_i w'_i,$$

where $A_i$ is the probability that $\mu$ pairs are drawn from the sample in such a way that the 
system of different pairs among them has the pattern $P_i$, and if $a_i$ is the number of points 
in $P_i$, $w'_i$ is the probability that if $a_i$ elements are drawn from the sample without replacement 
and paired according to pattern $P_i$, all pairs of $P_i$ are concordant. The summation is extended 
over all patterns $P_i$ with no more than $\mu$ pairs. 

Since the probabilities $w'_i$ are of the type for which formula (9) is applicable, we have 

$$E(p')^\mu = \Sigma\, A_i w_i, \tag{27}$$

where, as usual, $w_i$ is the population probability corresponding to the sample probability $w'_i$. 

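15. Consider a term $Aw$ in (27) corresponding to the pattern 

$$P = \Sigma\, \gamma_j S_j$$

with $\gamma = \Sigma\gamma_j$ simple patterns, $a = \Sigma\gamma_j a_j$ points and $b = \Sigma\gamma_j b_j$ pairs. 

Let 

$$\bar P = \sum_{j \ge 2} \gamma_j S_j$$

be the pattern obtained from $P$ by excluding the single-pair patterns $S_1$. Then 

$$\bar\gamma = \gamma - \gamma_1, \qquad \bar a = a - 2\gamma_1 \qquad\text{and}\qquad \bar b = b - \gamma_1 \tag{28}$$

are the numbers of simple patterns, points and pairs in $\bar P$. 

We have 

$$w = p^{\gamma_1}\,\bar w, \tag{29}$$

where $\bar w$ is independent of $p$ and $\gamma_1$, only depending on the pattern $\bar P$. 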


16. The probability $A$ will be studied, in the first place, as a function of $n$ and $\gamma_1$, while 
its dependence on $\bar P$ will be considered later and only in a special case. It must be borne in 
mind that, by (28), $\gamma$, $a$ and $b$ also depend on $\gamma_1$. 

Let $Q_1, Q_2, \ldots, Q_b$ be the pairs of pattern $P$ numbered in some definite order. Suppose 
pair $Q_\beta$ appears $\mu_\beta$ times $(\beta = 1, \ldots, b)$. Then 

$$\mu_1 + \ldots + \mu_b = \mu, \qquad \mu_\beta \ge 1, \quad (\beta = 1, \ldots, b).$$

Let $R_1, R_2, \ldots, R_\mu$ be the total set of the pairs drawn, numbered independently of the 
order in which they appear. Then $\mu_1$ $R$'s are equal to $Q_1$, $\mu_2$ $R$'s are equal to $Q_2$, etc. 

Let $B$ be the probability that among $\mu$ pairs drawn from the sample, replacing each pair 
after drawing, $b$ pairs are different and arranged according to pattern $P$, pair $Q_\beta$ appearing 
$\mu_\beta$ times $(\beta = 1, \ldots, b)$ and the $\mu$ pairs being drawn in a definite order, say $R_1, R_2, \ldots, R_\mu$. 

Suppose $R_1$ is a $Q_1$. Since any pair drawn may be taken as $Q_1$ (only the relative position 
of the pairs being relevant), the probability first to draw $R_1$ is 1. The probability that the 
second pair drawn is $R_2$ depends on whether $R_2$ has no, one or both elements in common 
with $R_1$. In the first case, it is $\binom{n-2}{2}/\binom{n}{2}$, in the second case $2(n-2)/\binom{n}{2}$ (the factor 2 
arising from the fact that each of the two elements of $R_1$ can be the element common with 
$R_2$), and in the third case, $1/\binom{n}{2}$. 

In general, if the first $\lambda$ pairs drawn are $R_1, \ldots, R_\lambda$, and if they form a pattern $P'$ containing 
$\alpha$ different elements, the probability that the $(\lambda + 1)$th pair drawn is $R_{\lambda+1}$ depends on whether 
$R_{\lambda+1}$ has no, one or both elements in common with $P'$. In the first case it is 
$\binom{n-\alpha}{2}/\binom{n}{2}$, in the second case, $c'(n - \alpha)/\binom{n}{2}$, and in the third case, $c''/\binom{n}{2}$, where $c'$ and $c''$ are 
independent of $n$. If, in the last case, $R_{\lambda+1}$ is equal to one of the preceding $R$'s, $c'' = 1$. 

$B$ is the product of all $\mu$ such probabilities, and it is seen from the above consideration 
that it is of the form 

$$B = C\,(n - 2)^{[a-2]} \binom{n}{2}^{-\mu+1},$$

where $C$ is independent of $n$. 

We also see that a pair which has already appeared before makes no contribution to $C$. 
Hence, $C$ only depends on the different pairs of pattern $P$, and is independent of the 
numbers $\mu_\beta$. 

The above reflexion further shows that for any simple pattern contained in $P$, the pair 
drawn first, having no elements in common with the preceding pairs, contributes to $C$ the 
factor $\tfrac{1}{2}$, except for the first pair, $R_1$, which yields the factor 1. Thus, $C$ contains the factor 
$2^{-\gamma+1} = 2^{-\gamma_1-\bar\gamma+1}$, and $2^{-\gamma_1}$ is obviously the sole contribution to $C$ from the single pairs 
(pattern $S_1$) contained in $P$. Hence 

$$B = 2^{-\gamma_1}\, C'\,(n - 2)^{[a-2]} \binom{n}{2}^{-\mu+1},$$

where $C'$ is independent of $n$ and $\gamma_1$, and also independent of the order in which the $\gamma_1$ 
single pairs are drawn. 

$A$, the probability that $\mu$ pairs drawn form pattern $P$, irrespective of the order in which 
they appear, depends on $n$ in the same way as $B$. As a function of $\gamma_1$, $A$ contains, besides 
$2^{-\gamma_1}$, the factor $1/\gamma_1!$ owing to the fact that the $\gamma_1$ single pairs are interchangeable. Further 
it contains the factor $\Sigma'_b(\mu)$ which indicates the number of ways of allocating $\mu$ objects on 
$b$ places so that no place remains empty. In virtue of (23) we have 

$$\Sigma'_b(\mu) = b!\, d_{\mu,\mu-b} = (\bar b + \gamma_1)!\, d_{\mu,\mu-\bar b-\gamma_1}.$$

Thus, $A$ is of the form 

$$A = D^{(\gamma_1)}_{\mu}\,(n - 2)^{[2\gamma_1+\bar a-2]} \binom{n}{2}^{-\mu+1}, \tag{30}$$

where 

$$D^{(\gamma_1)}_{\mu} = 2^{-\gamma_1}\,\frac{(\bar b + \gamma_1)!}{\gamma_1!}\, d_{\mu,\mu-\bar b-\gamma_1}\, D' \tag{31}$$

and $D'$ is independent of both $n$ and $\gamma_1$ and only depends on the pattern $\bar P$ containing no $S_1$. 
Inserting (29) and (30) in (27), we have 

$$\binom{n}{2}^{\mu-1} E(p')^\mu = \Sigma\, D^{(\gamma_1)}_{\mu}\, p^{\gamma_1}\, \bar w\, (n - 2)^{[2\gamma_1+\bar a-2]}, \tag{32}$$

the summation taking place over all patterns with no more than $\mu$ pairs. (32) also holds 
for $\mu = 0$ if by a ‘pattern of 0 pairs’ we understand the case $\gamma_1 = \gamma_2 = \ldots = 0$ and take (31) 
as definition of $D^{(\gamma_1)}_{\mu}$ with suitably chosen $D'$. $\beta^{[-\delta]}$ with $\delta > 0$ is defined by 

$$\beta^{[-\delta]}(\beta + \delta)^{[\delta]} = 1.$$

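17. If 

$$\mu_\nu(p') = E(p' - p)^\nu,$$

we have 

$$\binom{n}{2}^{\nu-1}\mu_\nu(p') = \sum_{\delta=0}^{\nu} (-1)^\delta \binom{\nu}{\delta}\, p^\delta \binom{n}{2}^{\delta} \binom{n}{2}^{\nu-\delta-1} E(p')^{\nu-\delta}.$$

Applying (32) with 

$$\mu = \nu - \delta, \qquad \gamma_1 = \kappa - \delta, \tag{33}$$

we have for the coefficient of $p^\kappa \bar w$ in $\binom{n}{2}^{\nu-1}\mu_\nu(p')$ 

$$\sum_{\delta=0}^{\nu} (-1)^\delta \binom{\nu}{\delta}\binom{n}{2}^{\delta} D^{(\kappa-\delta)}_{\nu-\delta}\,(n-2)^{[\bar a+2\kappa-2\delta-2]}$$
$$= \sum_{\delta=0}^{\nu} (-1)^\delta \binom{\nu}{\delta}\, 2^{-\delta} D^{(\kappa-\delta)}_{\nu-\delta} \sum_{\rho=0}^{\delta} (-1)^\rho \binom{\delta}{\rho}\, n^{2\delta-\rho}\,(n-2)^{[\bar a+2\kappa-2\delta-2]}.$$

Inserting here, in accordance with (15), 

$$n^{2\delta-\rho} = \sum_{\sigma=0}^{2\delta-\rho} d^{(\bar a+2\kappa-2\delta)}_{2\delta-\rho,\sigma}\,(n - \bar a - 2\kappa + 2\delta)^{[2\delta-\rho-\sigma]},$$

we have 

$$\sum_{\delta=0}^{\nu}(-1)^\delta\binom{\nu}{\delta}\,2^{-\delta} D^{(\kappa-\delta)}_{\nu-\delta}\sum_{\rho=0}^{\delta}(-1)^\rho\binom{\delta}{\rho}\sum_{\sigma=0}^{2\delta-\rho} d^{(\bar a+2\kappa-2\delta)}_{2\delta-\rho,\sigma}\,(n-2)^{[\bar a+2\kappa-2-\rho-\sigma]}.$$

Putting $\alpha = \bar a + 2\kappa - 2 - \rho - \sigma$, we have for the coefficient $K^{(\alpha)}_{\nu\kappa}$ of $p^\kappa \bar w\,(n-2)^{[\alpha]}$ in 
$\binom{n}{2}^{\nu-1}\mu_\nu(p')$ 

$$K^{(\alpha)}_{\nu\kappa} = \sum_{\delta=0}^{\nu}(-1)^\delta\binom{\nu}{\delta}\, a^{(\alpha)}_{\nu\kappa}(\delta), \tag{34}$$

where 

$$a^{(\alpha)}_{\nu\kappa}(\delta) = \sum_{\rho}(-1)^\rho\binom{\delta}{\rho}\, 2^{-\delta} D^{(\kappa-\delta)}_{\nu-\delta}\, d^{(\bar a+2\kappa-2\delta)}_{2\delta-\rho,\ \bar a+2\kappa-\alpha-\rho-2}. \tag{35}$$

Since $d^{(\bar a+2\kappa-2\delta)}_{2\delta-\rho,\ \bar a+2\kappa-\alpha-\rho-2} = 0$ if $\bar a + 2\kappa - \alpha - \rho - 2 < 0$, the upper limit of $\rho$ in the summation 
may be taken as $\bar a + 2\kappa - \alpha - 2$, which is independent of $\delta$. We have then, in virtue of (31), 

$$a^{(\alpha)}_{\nu\kappa}(\delta) = 2^{-\kappa}(\bar b + \kappa - \delta)^{[\bar b]}\, d_{\nu-\delta,\ \nu-\bar b-\kappa}\, D' \sum_{\rho=0}^{\bar a+2\kappa-\alpha-2}(-1)^\rho\binom{\delta}{\rho}\, d^{(\bar a+2\kappa-2\delta)}_{2\delta-\rho,\ \bar a+2\kappa-\alpha-\rho-2}, \tag{36}$$

where $D'$ is independent of $\delta$. 

In virtue of (I) and (II), para. 10, $a^{(\alpha)}_{\nu\kappa}(\delta)$ is a polynomial in $\delta$. The degree of the $(\rho + 1)$th 
term in the sum in (36) is $\rho + 2(\bar a + 2\kappa - \alpha - \rho - 2)$, which is highest for $\rho = 0$. Hence, the 
degree $d$ of $a^{(\alpha)}_{\nu\kappa}(\delta)$ is 

$$d = \bar b + 2(\nu - \bar b - \kappa) + 2(\bar a + 2\kappa - \alpha - 2) = 2(\nu + \bar a + \kappa - \alpha - 2) - \bar b. \tag{37}$$

Now, according to (34) and (14), $K^{(\alpha)}_{\nu\kappa} = 0$ if $d < \nu$, or, in virtue of (37), 

$$K^{(\alpha)}_{\nu\kappa} = 0 \quad\text{if}\quad 2\alpha > \nu - 4 + 2\bar a - \bar b + 2\kappa. \tag{38}$$

Applying (26) for pattern $\bar P$, we have, since $\bar\gamma_1 = 0$, $2\bar a \le 3\bar b$, and consequently 

$$2\bar a - \bar b + 2\kappa \le 2(\bar b + \kappa),$$

the sign of equality holding if and only if pattern $\bar P$ contains no other simple patterns 
than $S_1$ and $S_2$. 

Remembering that, according to para. 16, $b \le \mu$, we have in virtue of (33) 

$$\bar b + \kappa = \bar b + \gamma_1 + \delta = b + \delta \le \mu + \delta = \nu. \tag{39}$$

Thus in any case 

$$2\bar a - \bar b + 2\kappa \le 2\nu$$

and, in virtue of (38), 

$$K^{(\alpha)}_{\nu\kappa} = 0 \quad\text{if}\quad \alpha > \tfrac{3}{2}\nu - 2. \tag{40}$$

If $\bar P$ contains at least one simple pattern with more than two pairs, we even have 

$$2\bar a - \bar b + 2\kappa < 2\nu,$$

and consequently 

$$K^{(\alpha)}_{\nu\kappa} = 0 \quad\text{if}\quad \alpha \ge \tfrac{3}{2}\nu - 2. \tag{41}$$

From (40) it appears that the degree in $n$ of $\binom{n}{2}^{\nu-1}\mu_\nu(p')$ is 

$$\le 3h - 2 = \tfrac{3}{2}\nu - 2 \quad\text{if}\ \nu = 2h,$$
$$\le 3h - 1 = \tfrac{3}{2}\nu - \tfrac{5}{2} \quad\text{if}\ \nu = 2h + 1.$$

Thus, in $\mu_{2h+1}(p')$, expanded in powers of $n$, the degree of the highest term is 

$$\le 3h - 1 - 4h = -h - 1.$$

In $\mu_2(p')$, in virtue of (13), the degree of the highest term is $-1$, provided that $k - p^2 \ne 0$. 
Hence, the degree of 

$$\alpha_{2h+1}(p') = \frac{\mu_{2h+1}(p')}{\mu_2(p')^{h+\frac{1}{2}}}$$

is $\le -h - 1 + h + \tfrac{1}{2} = -\tfrac{1}{2}$. It follows that 

$$\alpha_{2h+1}(p') \to 0 \quad\text{if}\quad k - p^2 > 0. \tag{42}$$

($k - p^2 < 0$ is impossible since in this case var $(p')$ would become $< 0$ for large $n$.) 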


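18. As we have seen, we may write 

$$\binom{n}{2}^{2h-1}\mu_{2h}(p') = R_h\,(n-2)^{[3h-2]} + R'_h\,(n-2)^{[3h-3]} + \ldots.$$

Then it follows from (41) that $R_h$ only contains terms depending on patterns $S_1$ and 
$S_2$, that is, $R_h$ is of the form 

$$R_h = \sum_\kappa \sum_\lambda K^{(3h-2)}_{2h,\kappa\lambda}\, p^\kappa k^\lambda. \tag{43}$$

The only terms in $\binom{n}{2}^{\mu-1}E(p')^\mu$ which can contribute to this sum are of the form 

$$D^{(\gamma_1)}_{\mu}\, p^{\gamma_1} k^\lambda\,(n-2)^{[2\gamma_1+3\lambda-2]}.$$

The pattern corresponding to such a term is 

$$P = \gamma_1 S_1 + \lambda S_2.$$

Remembering the considerations in para. 16, we see that in each $S_2$ the pair drawn first 
contributes to $C$ the factor $\tfrac{1}{2}$ (except if it is $R_1$), while the first drawing of the other pair 
yields the factor 2. Hence, the $S_2$'s make no contribution to $C$, and we have 

$$C = 2^{-\gamma_1+1}.$$

The contribution of the patterns $S_2$ to $A$ is twofold: since in each $S_2$ the two pairs may be 
interchanged, this gives the factor $(1/2!)^\lambda$; and since the $\lambda$ patterns $S_2$ may be interchanged, 
we have the factor $1/\lambda!$. Thus 

$$D^{(\gamma_1)}_{\mu} = 2^{-\gamma_1-\lambda+1}\,\frac{(\gamma_1+2\lambda)!}{\gamma_1!\,\lambda!}\, d_{\mu,\mu-\gamma_1-2\lambda},$$

and, in virtue of (31), since $\bar b = 2\lambda$, 

$$D' = \frac{1}{2^{\lambda-1}\lambda!}. \tag{44}$$

Inserting in (38) 

$$\nu = 2h, \quad \alpha = 3h - 2, \quad \bar a = 3\lambda, \quad \bar b = 2\lambda, \tag{45}$$

we see that $K^{(3h-2)}_{2h,\kappa\lambda} \ne 0$ is possible only if 

$$\kappa + 2\lambda \ge 2h.$$

On the other hand, from (39), 

$$\kappa + 2\lambda \le 2h.$$

Hence, 

$$\kappa = 2h - 2\lambda. \tag{46}$$

Inserting this in (43), we have 

$$R_h = \sum_{\lambda=0}^{h} K^{(3h-2)}_{2h,2h-2\lambda,\lambda}\, p^{2h-2\lambda} k^\lambda, \tag{47}$$

where 

$$K^{(3h-2)}_{2h,2h-2\lambda,\lambda} = \sum_{\delta=0}^{2h}(-1)^\delta\binom{2h}{\delta}\, a^{(3h-2)}_{2h,2h-2\lambda,\lambda}(\delta).$$

According to (37) in connexion with (45), (46), the degree in $\delta$ of $a^{(3h-2)}_{2h,2h-2\lambda,\lambda}(\delta)$ is $2h$. The 
highest term, $a_0\delta^{2h}$, is contained in the term corresponding to $\rho = 0$ in (36). Inserting in 
(36) the values from (44), (45) and (46) and putting in the sum $\rho = 0$, we have 

$$\frac{2^{-2h+\lambda+1}}{\lambda!}\,(2h - \delta)^{[2\lambda]}\, d_{2h-\delta,0}\; d^{(4h-\lambda-2\delta)}_{2\delta,\,h-\lambda}.$$

Thus, in virtue of (16) and (II), para. 10, 

$$a_0 = \frac{2^{-2h+\lambda+1}}{\lambda!}\cdot\frac{(-1)^{h-\lambda}\,2^{2h-2\lambda}}{2^{h-\lambda}(h-\lambda)!} = \frac{(-1)^{h-\lambda}}{2^{h-1}h!}\binom{h}{\lambda}.$$

According to (14), 

$$K^{(3h-2)}_{2h,2h-2\lambda,\lambda} = (2h)!\, a_0.$$

Inserting this in (47) we have 

$$R_h = \frac{(2h)!}{2^{h-1}h!}\sum_{\lambda=0}^{h}(-1)^{h-\lambda}\binom{h}{\lambda}\, p^{2h-2\lambda}k^\lambda = \frac{(2h)!}{2^{h-1}h!}\,(k - p^2)^h.$$

The highest term of $\mu_{2h}(p')$ is thus 

$$2^h\,\frac{(2h)!}{h!}\,(k - p^2)^h\, n^{-h},$$

that of $\mu_2(p')$ is $4(k - p^2)n^{-1}$, and hence 

$$\alpha_{2h}(p') = \frac{\mu_{2h}(p')}{\mu_2^h(p')} \to \frac{(2h)!}{2^h h!} \quad\text{if}\quad k - p^2 > 0. \tag{48}$$

From (42) and (48) it follows according to the Second Limit Theorem that the distribution 
of $p'$ tends to normality as $n \to \infty$, provided that the marginal distributions are continuous 
and $k - p^2 > 0$. 

The condition $k - p^2 > 0$ is fulfilled if the population is distributed normally. For, comparing 
Esscher's formula (5) with (13), we find, since var $(\tau)$ = 4 var $(p')$, 

$$k - p^2 = \frac{1}{4}\left\{\frac{1}{9} - \left(\frac{2}{\pi}\sin^{-1}\frac{r}{2}\right)^2\right\}.$$

The right-hand side is positive if $|r| < 1$. 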


V. The variance of p' in the case of a finite population 

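19. Consider a sample of $n$ drawn from a finite population of $N$ in which all values of $x$ 
and all values of $y$ are different. For the sample probabilities $p'$, $k'$, $(p^2)'$, ... we write now 

$$p^{(n)},\ k^{(n)},\ (p^2)^{(n)},\ \ldots,$$

and for the corresponding population probabilities 

$$p^{(N)},\ k^{(N)},\ (p^2)^{(N)},\ \ldots.$$

Equation (9) remains valid and may be written as follows: 

$$E w^{(n)} = w^{(N)}. \tag{49}$$

In particular, $E p^{(n)} = p^{(N)}$. 

The essential difference between this case and the case $N = \infty$ considered above is that the 
composite probabilities such as $(p^2)^{(N)}$ or $(pk)^{(N)}$ are not equal to $(p^{(N)})^2$ or $p^{(N)}k^{(N)}$. For 
instance, $(p^{(N)})^2$ is evidently the same function of $p^{(N)}$, $k^{(N)}$, $(p^2)^{(N)}$ and $N$ as $(p')^2$ is of $p'$, $k'$, 
$(p^2)'$ and $n$. Thus we have, replacing $n$ by $N$ in (10), 

$$\binom{N}{2}(p^{(N)})^2 = p^{(N)} + 2(N-2)k^{(N)} + \binom{N-2}{2}(p^2)^{(N)} \tag{50}$$

and hence 

$$(p^2)^{(N)} = \binom{N-2}{2}^{-1}\left\{\binom{N}{2}(p^{(N)})^2 - p^{(N)} - 2(N-2)k^{(N)}\right\}. \tag{51}$$

On the other hand, from (10) and (49) 

$$\binom{n}{2}E(p^{(n)})^2 = p^{(N)} + 2(n-2)k^{(N)} + \binom{n-2}{2}(p^2)^{(N)}, \tag{52}$$

which is the equivalent of equation (11). 

On subtracting (50) from (52) we find 

$$\operatorname{var}(p^{(n)}) = \frac{2(N-n)}{n(n-1)N(N-1)}\Big\{(N+n-1)\,p^{(N)} + 2[Nn - 2(N+n-1)]\,k^{(N)} - [2Nn - 3(N+n-1)]\,(p^2)^{(N)}\Big\}. \tag{53}$$

Substituting for $(p^2)^{(N)}$ the expression in (51), we obtain 

$$\operatorname{var}(p^{(n)}) = \frac{2(N-n)}{n(n-1)(N-2)(N-3)}\Big\{(N+n-5)\,p^{(N)}(1 - p^{(N)}) + 2(n-2)(N-2)\big(k^{(N)} - (p^{(N)})^2\big)\Big\}, \tag{54}$$

or 

$$\binom{n}{2}\operatorname{var}(p^{(n)}) = \left(1 - \frac{(n-2)(n-3)}{(N-2)(N-3)}\right)p^{(N)}(1 - p^{(N)}) + 2(n-2)\left(1 - \frac{n-3}{N-3}\right)\big(k^{(N)} - (p^{(N)})^2\big). \tag{55}$$

For $N \to \infty$, (55) becomes the same as (13). 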
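For a small finite population the variance of the sample probability of concordance can be obtained exactly by listing every sample, which gives a check on the closed expression in terms of $p^{(N)}$ and $k^{(N)}$. The sketch below assumes that expression in the form $\frac{2(N-n)}{n(n-1)(N-2)(N-3)}\{(N+n-5)p(1-p) + 2(n-2)(N-2)(k-p^2)\}$ with $p = p^{(N)}$, $k = k^{(N)}$; the population of seven members and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations


def conc(a, b):
    return (a[0] - b[0]) * (a[1] - b[1]) > 0


def p_of(members):
    """Probability of concordance for a set of members."""
    pairs = list(combinations(members, 2))
    return Fraction(sum(conc(a, b) for a, b in pairs), len(pairs))


def k_of(members):
    """Probability that A is concordant with both B and C (ordered triples)."""
    triples = list(permutations(members, 3))
    return Fraction(sum(conc(a, b) and conc(a, c) for a, b, c in triples), len(triples))


def exact_variance(population, n):
    """Variance of p over all equally likely samples of n without replacement."""
    values = [p_of(s) for s in combinations(population, n)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def closed_form(population, n):
    """The closed expression stated in the lead-in (an assumption of this sketch)."""
    N = len(population)
    p, k = p_of(population), k_of(population)
    factor = Fraction(2 * (N - n), n * (n - 1) * (N - 2) * (N - 3))
    return factor * ((N + n - 5) * p * (1 - p) + 2 * (n - 2) * (N - 2) * (k - p * p))


population = [(1, 4), (2, 1), (3, 6), (4, 2), (5, 7), (6, 3), (7, 5)]
comparison = {n: (exact_variance(population, n), closed_form(population, n))
              for n in range(2, 8)}
```

For $n = 2$ the expression reduces to $p^{(N)}(1 - p^{(N)})$ and for $n = N$ it vanishes, as it should.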


VI. A sample estimate of var (p') 


20. In the case of an infinite population, let 

$$\binom{n}{2}\operatorname{var}'(p') = p' + 2(n-2)k' - (2n-3)(p^2)'. \tag{56}$$

Then, in virtue of (9) and (12), 

$$E \operatorname{var}'(p') = \operatorname{var}(p').$$

On inserting in (56) for $(p^2)'$ the expression obtained from (10), we find 

$$\binom{n-2}{2}\operatorname{var}'(p') = p'(1 - p') + 2(n-2)\big(k' - (p')^2\big),$$

or 

$$\operatorname{var}'(p') = \frac{2}{(n-2)(n-3)}\,p'q' + \frac{4}{n-3}\big(k' - (p')^2\big). \tag{57}$$

By analogy, 

$$\operatorname{var}'(q') = \frac{2}{(n-2)(n-3)}\,p'q' + \frac{4}{n-3}\big(l' - (q')^2\big), \tag{58}$$

where $l'$ is the probability that among three members $A$, $B$, $C$ drawn from the sample without 
replacement, one, say $A$, is discordant with the other two. 

In the case of a finite population of the type considered in para. 19, we define in a similar 
way a statistic var$^{(n)}(p^{(n)})$ such that 

$$E \operatorname{var}^{(n)}(p^{(n)}) = \operatorname{var}(p^{(n)}).$$

We find 

$$\operatorname{var}^{(n)}(p^{(n)}) = \frac{2(N-n)}{N(N-1)(n-2)(n-3)}\Big\{(N+n-5)\,p^{(n)}(1 - p^{(n)}) + 2(n-2)(N-2)\big(k^{(n)} - (p^{(n)})^2\big)\Big\}. \tag{59}$$

A comparison between (59) and (54) shows that var$^{(n)}(p^{(n)})$ is obtained from var $(p^{(n)})$ 
by interchanging $n$ and $N$ and taking the opposite sign. 
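That the finite-population estimate is unbiased can be tried on a small population by averaging it over every possible sample. The sketch below assumes the estimate in the form $\frac{2(N-n)}{N(N-1)(n-2)(n-3)}\{(N+n-5)p(1-p) + 2(n-2)(N-2)(k-p^2)\}$ with $p$ and $k$ computed from the sample; the population and function names are hypothetical, and $n \ge 4$ is required.

```python
from fractions import Fraction
from itertools import combinations, permutations


def conc(a, b):
    return (a[0] - b[0]) * (a[1] - b[1]) > 0


def p_of(members):
    pairs = list(combinations(members, 2))
    return Fraction(sum(conc(a, b) for a, b in pairs), len(pairs))


def k_of(members):
    triples = list(permutations(members, 3))
    return Fraction(sum(conc(a, b) and conc(a, c) for a, b, c in triples), len(triples))


def estimate(sample, N):
    """Sample estimate of var(p) for a population of N (form assumed in the lead-in)."""
    n = len(sample)
    p, k = p_of(sample), k_of(sample)
    factor = Fraction(2 * (N - n), N * (N - 1) * (n - 2) * (n - 3))
    return factor * ((N + n - 5) * p * (1 - p) + 2 * (n - 2) * (N - 2) * (k - p * p))


def mean_estimate_and_variance(population, n):
    """Average of the estimate, and true variance of p, over all samples of n."""
    N = len(population)
    samples = list(combinations(population, n))
    ps = [p_of(s) for s in samples]
    mean_p = sum(ps) / len(ps)
    true_var = sum((v - mean_p) ** 2 for v in ps) / len(ps)
    mean_est = sum(estimate(s, N) for s in samples) / len(samples)
    return mean_est, true_var


population = [(1, 4), (2, 1), (3, 6), (4, 2), (5, 7), (6, 3), (7, 5)]
results = {n: mean_estimate_and_variance(population, n) for n in (4, 5, 6)}
```

For each sample size the average of the estimate over all samples coincides with the true variance.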

21. Let $g_\nu$ and $h_\nu$ be the numbers of sample members concordant and discordant with 
$A_\nu = (x_\nu, y_\nu)$ $(\nu = 1, \ldots, n)$. The probability of drawing first the member $A_\nu$ and then, 
without replacing it, a member concordant with $A_\nu$ is $\frac{1}{n}\cdot\frac{g_\nu}{n-1}$. The probability of 
drawing, without replacement, first $A_\nu$ and then two other members concordant with $A_\nu$ is 
$\frac{g_\nu(g_\nu - 1)}{n(n-1)(n-2)}$. Hence 

$$p' = \frac{\Sigma g_\nu}{n(n-1)}, \qquad k' = \frac{\Sigma g_\nu(g_\nu - 1)}{n(n-1)(n-2)}. \tag{60}$$

Similarly, 

$$q' = \frac{\Sigma h_\nu}{n(n-1)}, \qquad l' = \frac{\Sigma h_\nu(h_\nu - 1)}{n(n-1)(n-2)}. \tag{61}$$

If only the value of $p'$ or $q'$ is required, the use of (6) may be more expedient than that of 
(60) or (61). If, however, the variance, and hence $k'$ or $l'$, is wanted, the calculation by means 
of the numbers $g_\nu$ and $h_\nu$ (whose sums are twice the numbers of inversions $I$, $J$) according 
to (60) or (61) is to be preferred. 

If $p' > \tfrac{1}{2}$ it is more convenient to calculate $q'$ and $l'$ from (61); if $p' < \tfrac{1}{2}$, the calculation of 
$p'$ and $k'$ by (60) is more rapid. In many cases one can see directly from the given data 
whether the concordant or the discordant pairs prevail, before actually calculating $p'$ or $q'$. 
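The calculation by means of the numbers $g_\nu$ and $h_\nu$ may be sketched as follows. The function names and the sample are hypothetical; the variance estimates are taken in the form $\frac{2}{(n-2)(n-3)}p'q' + \frac{4}{n-3}(k' - p'^2)$ and its analogue with $l'$ and $q'$, which is an assumption about the formulae referred to as (57) and (58).

```python
from fractions import Fraction


def g_h_counts(sample):
    """g[v], h[v]: numbers of members concordant / discordant with member v."""
    g, h = [], []
    for i, (xi, yi) in enumerate(sample):
        gi = sum(1 for j, (xj, yj) in enumerate(sample)
                 if j != i and (xi - xj) * (yi - yj) > 0)
        g.append(gi)
        h.append(len(sample) - 1 - gi)   # no ties, so the rest are discordant
    return g, h


def estimates(sample):
    """p', q', k', l' and the two variance estimates (requires n >= 4)."""
    n = len(sample)
    g, h = g_h_counts(sample)
    p = Fraction(sum(g), n * (n - 1))
    q = Fraction(sum(h), n * (n - 1))
    k = Fraction(sum(gv * (gv - 1) for gv in g), n * (n - 1) * (n - 2))
    l = Fraction(sum(hv * (hv - 1) for hv in h), n * (n - 1) * (n - 2))
    var_p = Fraction(2, (n - 2) * (n - 3)) * p * q + Fraction(4, n - 3) * (k - p * p)
    var_q = Fraction(2, (n - 2) * (n - 3)) * p * q + Fraction(4, n - 3) * (l - q * q)
    return p, q, k, l, var_p, var_q


sample = [(1, 3), (2, 1), (3, 4), (4, 2), (5, 5), (6, 6)]   # hypothetical data
p, q, k, l, var_p, var_q = estimates(sample)
var_tau = 4 * var_p    # since tau = 2p' - 1
```

For this sample $p' = 4/5$, and the two variance estimates coincide whichever of the pairs $(p', k')$ or $(q', l')$ is used.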

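Since $p' + q' = 1$, we have 

$$\operatorname{var}(p') = \operatorname{var}(q')$$

and also, in the case of a finite population, 

$$\operatorname{var}(p^{(n)}) = \operatorname{var}(q^{(n)}).$$

If we write down the equation for var $(q^{(n)})$ analogous to (55) and subtract it from (55), 
we have 

$$\big(k^{(N)} - (p^{(N)})^2\big) - \big(l^{(N)} - (q^{(N)})^2\big) = 0,$$

or 

$$k^{(N)} - l^{(N)} = (p^{(N)})^2 - (q^{(N)})^2 = p^{(N)} - q^{(N)}.$$

Substituting $n$ for $N$, we have 

$$k' - l' = p' - q' = \tau.$$

Comparing this with (57) and (58) we see that 

$$\operatorname{var}'(p') = \operatorname{var}'(q').$$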





REFERENCES 

Esscher, F. (1924). On a method of determining correlation from the ranks of the variates. Skand. 
Aktuar. 7, 201-19. 

Jordan, Ch. (1927). Statistique mathématique. Paris: Gauthier-Villars. 

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81-93. 

Lindeberg, J. W. (1926). Ueber die Korrelation. Den VI skandinaviske Matematikerkongres i 
København 31 August-4 September 1925, pp. 437-46. København: J. Gjellerup. 


ADDENDUM 

On p. 184 above, I quoted J. W. Lindeberg as having given the formula for the variance 
of the probability of concordance $p'$ without proof. I was not aware then that a proof of 
this formula, as well as that of the corresponding expression for a finite population 
(equation (54) of my paper), is contained in another paper by Lindeberg, ‘ Some remarks 
on the mean error of the percentage of correlation,’ Nordic Statistical Journal , 1, 137-41 
(1929). 





THE SIGNIFICANCE OF RANK CORRELATIONS WHERE 
PARENTAL CORRELATION EXISTS 

By H. E. DANIELS ( Wool Industries Research Association) 
and M. G. KENDALL 

1. All the known tests of significance of rank correlation coefficients are based on dis- 
tributions from a population in which each possible ranking occurs equally frequently, 
i.e. the null case where no parental correlation exists. We may then say of any particular 
coefficient whether it is significant in the sense that it cannot have arisen with any acceptable 
probability from an uncorrelated population. No tests are known in the case where parental 
correlation exists, and we have not seen the point discussed except in reference to the 
replacement of rank correlations by grade or product-moment correlations. Thus, for 
example, if two rank correlation coefficients are both found to be significant there has 
hitherto been no exact method of deciding whether their difference is significant. In this 
paper we consider the problem of determining confidence intervals for a rank correlation 
when the parent is correlated and develop a test of significance for the difference of two 
correlations. 


2. In testing an ordinary product-moment correlation the problem is enormously 
simplified by the assumption that the population is normal, or the further assumption that 
normal theory holds good even when the parent deviates only moderately from normality. 
Apart from means and variances the population is then completely specified by the single
parent parameter ρ and, as is well known, the sample distribution of the estimator depends
only on ρ and the sample number n.

In ranking theory this position no longer obtains. No assumption can in general be made 
about the form of the parent distribution and, in particular, the parent correlation does not 
completely specify the problem. The usual type of variate theory cannot, therefore, be
expected to meet the requirements.


3. A satisfactory approach to the problem can, however, be made if the rank correlation 
is measured by the coefficient known as r (Kendall, 1943, chap. 16). We shall then show that, 
for large samples at any rate, the problem admits of a solution. 

Let the population consist of N members. They may be imagined as laid out in the natural
order 1, 2, ..., N according to the first variate. The rankings according to the second variate
are then some permutation of the numbers 1 to N, and this second array of ranks is all we
need write down in particular cases. It determines the rank correlation r. Now suppose
we choose a sample of n in one of the $\binom{N}{n}$ possible ways. This sample will, so far as the first
variate is concerned, be in the natural order, and the ranks according to the second variate
permit of the calculation of a sample correlation t. For all possible samples and any given
arrangement of the parent members there will be a distribution of values of t.



4. The sample value of t is an unbiased estimator of r; that is to say, the mean value of
t in all possible samples is r. For consider the $\binom{N}{n}$ samples of n. Any particular pair of
members will occur in $\binom{N-2}{n-2}$ samples, that is, all pairs occur equally frequently in the
totality of all samples. In calculating t we assign to any pair +1 if its members are in the
right order and -1 if they are in the wrong order, so that the total score for all samples is
$\binom{N-2}{n-2}$ times the score for the population. To obtain t we divide the score for any sample
by ½n(n - 1), and to obtain r we divide the population score by ½N(N - 1). Hence if Σ is the
score for the population, the mean value (expectation) of t is

$$E(t) = \frac{\binom{N-2}{n-2}\,\Sigma}{\tfrac12 n(n-1)\binom{N}{n}} = \frac{\Sigma}{\tfrac12 N(N-1)} = r. \tag{4.1}$$


5. Unfortunately, it is not true that higher moments of t depend only on r. A single 
example will illustrate the point. Consider the ranking of 9: 

5 2 3 1 6 7 8 9 4. 

If the $84 = \binom{9}{3}$ possible samples of three are written down and t evaluated for each, the
distribution of S (the number of positive pairs) is found to be as follows:

    Values of S    Frequency f
         0              2
         1             15
         2             34
         3             33
       Total           84

The mean of this distribution is 182/84 = 13/6, and since

$$t = \frac{2S}{\tfrac12 n(n-1)} - 1,$$

the mean value of t is (26/18) - 1 = 0.44. The value of S for the parent ranking is 26 and hence
r = (52/36) - 1 = 0.44, verifying equation (4.1). The ranking

1 2 5 9 3 6 7 8 4

also has r = 0.44, but the distribution of S in samples of three is now:

    Values of S    Frequency f
         0              3
         1             16
         2             29
         3             36
       Total           84

The second moment of this distribution is 5.429, against 5.333 for the first distribution, the
variances being 0.734 against 0.639.
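
The two frequency tables of § 5 can be checked by direct enumeration. The following is a minimal sketch, not part of the original paper; the function names are illustrative, and S is taken, as in the text, to be the number of pairs of a sample that stand in increasing order.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def positive_pairs(ranks):
    """Number of pairs (earlier, later) whose ranks are in increasing order."""
    return sum(1 for a, b in combinations(ranks, 2) if a < b)

def sample_distribution(parent, n):
    """Frequency of S over every sample of size n; combinations() keeps parent order."""
    return Counter(positive_pairs(s) for s in combinations(parent, n))

def variance_of_S(dist):
    """Exact variance of S for a frequency table {value: frequency}."""
    total = sum(dist.values())
    mean = Fraction(sum(s * f for s, f in dist.items()), total)
    second = Fraction(sum(s * s * f for s, f in dist.items()), total)
    return second - mean ** 2

first = sample_distribution((5, 2, 3, 1, 6, 7, 8, 9, 4), 3)
second = sample_distribution((1, 2, 5, 9, 3, 6, 7, 8, 4), 3)
```

Both parents have 26 positive pairs, and both sampling distributions have mean 13/6, but their variances differ (23/36 and 185/252), which is the point of the example.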

6. Thus for any parent with given r there is in general more than one sampling distribution 
of t according to the arrangement of the parent ranks. In short, as mentioned above, the 
parameter r does not completely specify the sampling distribution and in asking the question:
What is the standard error of t? we are seeking an answer which does not exist.

It will be shown, however, that for any given parent ranking the distribution of t tends to 
normality with increasing n. The sampling properties of t can therefore be specified to a 
first approximation by its first and second moments only, when the samples are not too small. 
Further, it will be proved that for given r the variance of t cannot exceed a certain function 
of r and n whatever the parent ranking. From a knowledge of t and n only, it is thus possible 
to set outer bounds to confidence intervals for r provided n is large enough for the normal 
approximation to hold. The limits obtained in this way are sometimes rather wide, and an 
alternative procedure is to estimate the true variance of t directly from the sample itself 
according to a formula given below. This avoids the loss of efficiency consequent on using an 
upper limit to the variance, but it is not known how large a sample is required for the error 
of estimation to be tolerable. 


7. The development of the theory is facilitated if we introduce at the present stage a 
notation similar to that used by Daniels (1944). The ith and jth ranks corresponding to the
second variate are together assigned a score $a_{ij}$ which takes the value +1 if the members
are in the correct order, -1 if in the wrong order, and $a_{ii}$ is defined to be zero. The ranks for
the first variate are similarly assigned scores $b_{ij}$, but as the members have been taken in the
correct order for this variate, the scores are simply $b_{ij} = +1$ for $i < j$, $b_{ij} = -1$ for $i > j$, and
$b_{ii} = 0$. Next we define $c_{ij} = a_{ij}b_{ij}$, so that $c_{ij} = \pm 1$ according to whether the ranks for the
two variates agree or differ in order, and $c_{ii} = 0$. In this notation

$$r = c/N(N-1),$$

where $c = \sum c_{ij}$, i and j both being summed from 1 to N.

When the sample of n pairs is selected at random from the parent of N members and its
coefficient t is calculated, the values of $c_{ij}$ for the members of the sample remain the same as
in the population. This fact makes r much more suitable for the present problem than the
Spearman coefficient ρ, whose associated scores do not possess the same property. The sample
rank correlation is then

$$t = c^{(n)}/n(n-1),$$

where $c^{(n)} = \sum^{(n)} c_{ij}$ and $\sum^{(n)}$ denotes summation only over those values of i and j occurring
in the sample.


8. It has already been proved that E(t) = r. To find the variance of t we require $E(t^2)$,
so consider

$$\sum_n [c^{(n)}]^2 = \sum_n \sum{}^{(n)} c_{ij}c_{kl},$$

$\sum_n$ denoting summation over all selections of the sample of n from the finite parent population
of N members. Let us enumerate the number of ways in which $c_{ij}c_{kl}$ and similar products
with 'tied' suffixes, such as $c_{ij}c_{il}$, occur in the sum.

(i) When i, j, k, l are all different the term $c_{ij}c_{kl}$ may occur with $\binom{N-4}{n-4}$ selections of the
remaining members of the sample and the contribution of such terms to $\sum_n$ is
$\binom{N-4}{n-4}\sum' c_{ij}c_{kl}$, $\sum'$ meaning summation over all unequal values of i, j, k, l from 1 to N.

(ii) The term $c_{ij}c_{il}$ similarly occurs in $\binom{N-3}{n-3}$ ways and there are four ways of tying one
suffix, each of which gives the same contribution to $\sum_n$ since $c_{ij}$ is symmetrical. The total
contribution is $4\binom{N-3}{n-3}\sum' c_{ij}c_{il}$.

(iii) Terms with both suffixes tied contribute $2\binom{N-2}{n-2}\sum' c_{ij}c_{ij}$ to $\sum_n$, and all other terms are
zero since $c_{ii} = 0$.

Hence, expressing the $\sum'$'s in terms of the corresponding $\sum$'s and dividing out by $\binom{N}{n}$, we obtain

$$E[c^{(n)}]^2 = \frac{n^{(4)}}{N^{(4)}}\Bigl(\sum c_{ij}c_{kl} - 4\sum c_{ij}c_{il} + 2\sum c_{ij}c_{ij}\Bigr) + \frac{4n^{(3)}}{N^{(3)}}\Bigl(\sum c_{ij}c_{il} - \sum c_{ij}c_{ij}\Bigr) + \frac{2n^{(2)}}{N^{(2)}}\sum c_{ij}c_{ij},$$

where $n^{(r)} = n(n-1)\ldots(n-r+1)$. Since $\sum c_{ij}c_{ij} = N(N-1)$ and $\sum c_{ij}c_{kl} = c^2$, the variance of
t for given r and n is seen to depend on the value of $\sum c_{ij}c_{il} = \sum c_i^2$, where $c_i = \sum_{j=1}^{N} c_{ij}$.

Let N become large. The quantities c and $\sum c_i^2$ are respectively $O(N^2)$ and $O(N^3)$, so if we
introduce $r_i = c_i/N$ the value of $E(t^2)$ for large N becomes

$$E(t^2) = \frac{(n-2)(n-3)}{n(n-1)}\,r^2 + \frac{4(n-2)}{n(n-1)}\,\frac{\sum r_i^2}{N} + \frac{2}{n(n-1)},$$

and hence in the limit the variance of t is

$$\operatorname{var} t = \frac{4(n-2)}{n(n-1)}\operatorname{var} r_i + \frac{2}{n(n-1)}(1 - r^2). \tag{8.1}$$
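
The finite-population expression for $E[c^{(n)}]^2$ in § 8 can be compared with a brute-force average over all samples. This is a minimal sketch under the reading of the formula given here (the three bracketed sums weighted by $n^{(4)}/N^{(4)}$, $4n^{(3)}/N^{(3)}$ and $2n^{(2)}/N^{(2)}$); the function names are illustrative and exact rational arithmetic is used so that the two sides can be compared for equality.

```python
from fractions import Fraction
from itertools import combinations

def score_matrix(ranks):
    """c[i][j] = +1 if members i and j agree in order on both variates, -1 if not, 0 if i == j."""
    N = len(ranks)
    return [[0 if i == j else (1 if (ranks[i] - ranks[j]) * (i - j) > 0 else -1)
             for j in range(N)] for i in range(N)]

def falling(x, r):
    """x^(r) = x (x - 1) ... (x - r + 1)."""
    out = 1
    for k in range(r):
        out *= x - k
    return out

def second_moment_formula(ranks, n):
    """E[c^(n)]^2 from the finite-population expression of section 8."""
    c = score_matrix(ranks)
    N = len(ranks)
    total = sum(map(sum, c))                       # c = sum of all c_ij
    sum_ci_sq = sum(sum(row) ** 2 for row in c)    # sum c_ij c_il = sum c_i^2
    pairs = N * (N - 1)                            # sum c_ij c_ij

    def ratio(r):
        return Fraction(falling(n, r), falling(N, r))

    return (ratio(4) * (total ** 2 - 4 * sum_ci_sq + 2 * pairs)
            + 4 * ratio(3) * (sum_ci_sq - pairs)
            + 2 * ratio(2) * pairs)

def second_moment_enumerated(ranks, n):
    """The same quantity by averaging [c^(n)]^2 over every sample of size n."""
    c = score_matrix(ranks)
    samples = list(combinations(range(len(ranks)), n))
    total = sum(sum(c[i][j] for i in s for j in s) ** 2 for s in samples)
    return Fraction(total, len(samples))
```

For the ranking 5 2 3 1 6 7 8 9 4 of § 5 and n = 3 both functions give 52/3, in agreement with the second moment 5.333 of S quoted there, since $c^{(n)} = 2(2S - 3)$ when n = 3.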


9. The variance of t satisfies the inequality

$$\operatorname{var} t \le \frac{2}{n}(1 - r^2), \tag{9.1}$$

whatever the parent ranking. Moreover, though the limit may not be attained in any
particular parent ranking, reasons are given in the Appendix for expecting that it cannot
be substantially improved upon. The proof is as follows.

Reverting to a finite parent population of N members, we first seek a maximum for $\sum c_i^2$.
In terms of the original scores, $c_i = \sum_j a_{ij}b_{ij}$. Keeping $b_{ij} = \pm 1$, $i \ne j$, $b_{ii} = 0$, as before, allow
the $a_{ij}$'s to assume any values subject to the conditions

$$\sum a_{ij}^2 = N(N-1), \qquad \sum a_{ij}b_{ij} = c = N(N-1)\,r.$$

The stationary values of $\sum c_i^2$ occur when the $a_{ij}$'s satisfy the equations

$$b_{ij}(c_i + c_j) - \lambda a_{ij} - \mu b_{ij} = 0,$$

which give, on multiplying by $b_{ij}$ and summing over j,

$$c_i = \frac{\mu(N-1) - c}{N - 2 - \lambda}.$$

Thus, unless the $c_i$'s are all to be equal, in which case $\sum c_i^2$ is a minimum, λ and μ must take
the values

$$\lambda = N - 2, \qquad \mu = c/(N-1),$$

and since

$$2\sum c_i^2 - \lambda N(N-1) - \mu c = 0,$$

it follows that $\sum c_i^2$ cannot exceed $\tfrac12 N(N-1)(N-2) + \tfrac12 c^2/(N-1)$. Allowing N to become
large, this implies

$$\frac{\sum r_i^2}{N} \le \tfrac12(1 + r^2).$$

Hence

$$\operatorname{var} r_i \le \tfrac12(1 - r^2),$$

and so from equation (8.1)

$$\operatorname{var} t \le \frac{2}{n}(1 - r^2). \tag{9.1}$$
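
The finite-N bound on $\sum c_i^2$ used in this proof can be checked exhaustively for small N. The sketch below is illustrative only (the function names are not from the paper); it evaluates $\sum c_i^2$ for every ranking of N members and compares it with $\tfrac12 N(N-1)(N-2) + \tfrac12 c^2/(N-1)$.

```python
from fractions import Fraction
from itertools import permutations

def row_scores(ranks):
    """c_i = sum over j of c_ij, for a parent ranking listed in the natural order of the first variate."""
    N = len(ranks)
    return [sum(1 if (ranks[i] - ranks[j]) * (i - j) > 0 else -1
                for j in range(N) if j != i)
            for i in range(N)]

def largest_ratio(N):
    """Largest value of sum(c_i^2) divided by its bound, over all N! rankings."""
    best = Fraction(0)
    for ranks in permutations(range(1, N + 1)):
        ci = row_scores(ranks)
        c = sum(ci)
        bound = Fraction(N * (N - 1) * (N - 2), 2) + Fraction(c * c, 2 * (N - 1))
        best = max(best, Fraction(sum(x * x for x in ci)) / bound)
    return best
```

A returned value of exactly 1 means that no ranking exceeds the bound and that at least one attains it; the natural order, for which every $c_i = N - 1$, is one such ranking.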


10. Assuming that the sample is large enough for the distribution of t to be normal, the
roots $r_1$, $r_2$ of the equation

$$\frac{t - r}{\sqrt{\left[\frac{2}{n}(1 - r^2)\right]}} = \pm x, \tag{10.1}$$

i.e.

$$r = \frac{t \pm x\sqrt{(2/n)}\,\sqrt{(1 + 2x^2/n - t^2)}}{1 + 2x^2/n}, \tag{10.2}$$

provide confidence limits to r when t is known, x being the standardized normal deviate
corresponding to a given probability of P %. These confidence limits are of course maxima,
in the sense that we shall be wrong in at most P % of the cases in asserting r to lie between
the calculated limits.

In our proof of the tendency of t to normality it will be necessary to neglect terms of order
$n^{-1/2}$, and the sample may have to be rather large for such terms to be small, unless r itself
is small.

The form of equation (9.1) suggests using

$$w = \sin^{-1} t$$

instead of t. To the same order of approximation we can take w as having a normal
distribution with mean $\omega = \sin^{-1} r$ and standard error not exceeding $\sqrt{(2/n)}$, which is
independent of r. This form is more convenient for assigning confidence limits to r, and for testing
the significance of the difference between $t_1$ and $t_2$ (whose standard error cannot exceed
$\sqrt{[2(1/n_1 + 1/n_2)]}$), but we have not been able to discover whether the transformation brings
the distribution nearer to normality.
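
The two sets of limits described in § 10 are easily computed. The following is a sketch only: the limits in `max_limits` are obtained by solving the quadratic $(t - r)^2 = (2x^2/n)(1 - r^2)$ for r, which is how (10.1) is read here, and the function names and the default x = 1.96 are illustrative choices.

```python
import math

def max_limits(t, n, x=1.96):
    """Roots r1 < r2 of (t - r)^2 = (2 x^2 / n)(1 - r^2): the 'maximum' limits of section 10."""
    a = 2.0 * x * x / n
    half = math.sqrt(a) * math.sqrt(1.0 + a - t * t)
    return (t - half) / (1.0 + a), (t + half) / (1.0 + a)

def arcsine_limits(t, n, x=1.96):
    """Limits from w = arcsin(t), taking the standard error of w as sqrt(2/n)."""
    w = math.asin(t)
    half = x * math.sqrt(2.0 / n)
    lower = max(w - half, -math.pi / 2)   # keep the angle inside the range of arcsin
    upper = min(w + half, math.pi / 2)
    return math.sin(lower), math.sin(upper)
```

With t = 0.490 and n = 30 the first function gives limits of about -0.02 and 0.80, and the second about 0.01 and 0.85, which are the figures quoted for assessor A in the worked example of § 17.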


11. We now prove that the distribution of t tends for large n to normality whatever the
parent ranking, provided that | r | is not near unity.

Write $g_{ij} = c_{ij} - c/N^2$ so that $\sum g_{ij} = 0$, $g_{ij} = g_{ji}$ and $g_{ii} = -c/N^2 = -(N-1)r/N$. The
rth moment of $c^{(n)}$ about its mean value is $E[\sum^{(n)} g_{ij}]^r$, so consider

$$\sum_n \Bigl[\sum{}^{(n)} g_{ij}\Bigr]^r = \sum_n \sum{}^{(n)} g_{ij}\,g_{kl}\,g_{uv}\ldots,$$

the summation $\sum_n$ being over all possible sample selections.

The argument used by Daniels (1944) to show that in the null case the distribution of rank
correlation in large samples tends to normality can be applied with little modification to the
present problem. The proof is therefore sketched here without much detail.

Two essential conditions to be satisfied are that $\sum g_{ij} = 0$, which is true by definition, and
$\sum g_{ij}g_{ik} = O(N^3)$, which is true only if $1 - r^2 = O(1)$, so that the tendency to normality may
be expected to break down for high correlations.

The sum $\sum_n$ is evaluated as in § 8 by counting the number of ways in which terms like
$g_{ij}g_{kl}g_{uv}\ldots$, and similar terms with tied suffixes, occur. In this way it is expressed as a linear
combination of $\sum' g_{ij}g_{kl}g_{uv}\ldots$, etc. Every such $\sum'$ is replaceable by the corresponding $\sum$
together with terms containing more tied suffixes which are of lower order in N since they
involve fewer summations from 1 to N.


12. First consider the even moments with r = 2m. Terms containing more than 3m
different suffixes must vanish, since in such cases it is impossible to avoid at least one $g_{ij}$
with two free suffixes, and $\sum g_{ij} = 0$. For the same reason the only non-vanishing terms with
3m different suffixes are those containing expressions like

$$\sum g_{ij}g_{ik}\,g_{lu}g_{lv}\,g_{pq}g_{pr}\ldots = \Bigl(\sum g_{ij}g_{ik}\Bigr)^m,$$

and terms with fewer different suffixes are of correspondingly lower order in N.

With 3m suffixes assigned there are $\binom{N-3m}{n-3m}$ ways of selecting the remaining n - 3m
members of the sample, and the suffixes can be tied in $(2m)!\,2^{2m}/\{(2!)^m\,m!\}$ ways to give the
same result. Dividing out by $\binom{N}{n}$ and noting that $\binom{N-3m}{n-3m}\big/\binom{N}{n} \sim n^{3m}/N^{3m}$ when both N
and n are large, the contribution of such terms to $\mu_{2m}$, the 2mth moment of $c^{(n)}$ about its
mean, is found to be

$$\frac{n^{3m}}{N^{3m}}\,\frac{(2m)!}{m!}\,2^m\Bigl(\sum g_{ij}g_{ik}\Bigr)^m,$$

which is of order $n^{3m}$. Moreover, by the same argument, terms with f < 3m different suffixes
add contributions of order $n^f$ which may be neglected. Hence

$$\mu_{2m} = \frac{n^{3m}}{N^{3m}}\,\frac{(2m)!}{m!}\,2^m\Bigl(\sum g_{ij}g_{ik}\Bigr)^m,$$

the neglected terms being relatively $O(n^{-1})$.


13. For the odd moments let r = 2m + 1. Similar considerations show that the non-vanishing
terms of $\sum_n$ cannot have more than 3m + 1 different suffixes, and $\mu_{2m+1}$ is therefore
of order $n^{3m+1}$.

Then since $c^{(n)}/\sqrt{\mu_2}$ has even moments of unit order and odd moments of order $n^{-1/2}$, the odd
moments may be neglected to that order. We conclude that $c^{(n)}$ is distributed normally for
large n with variance

$$\frac{4n^3}{N^3}\sum g_{ij}g_{ik} = 4n^3\operatorname{var} r_i,$$

and t is similarly normal with variance $(4/n)\operatorname{var} r_i$.

14. The fact that terms of order $n^{-1/2}$ have to be neglected suggests that the normal
approximation only holds good for fairly large samples. This is not surprising since one would expect
skewness to be an important property of the distribution of t when r is not zero, if only for
the reason that | t | can never exceed unity. It seems worth while to examine the odd moments
in more detail.

The dominant term of the (2m + 1)th moment has 3m + 1 different suffixes, which can
occur as $\sum g_{ij}g_{ik}g_{il}\,(\sum g_{uv}g_{uw})^{m-1}$ or $\sum g_{ij}g_{ik}g_{jl}\,(\sum g_{uv}g_{uw})^{m-1}$. Both can be obtained in

$$\frac{(2m+1)!\,2^{2m-2}\,2^3}{(2!)^{m-1}\,3!\,(m-1)!} = \frac{(2m+1)!\,2^{m+2}}{3!\,(m-1)!}$$

distinct ways, and there are $\binom{N-3m-1}{n-3m-1}$ ways of selecting the sample with 3m + 1 suffixes
assigned. The (2m + 1)th moment of $c^{(n)}$ about its mean is therefore

$$\mu_{2m+1} = \frac{n^{3m+1}}{N^{3m+1}}\,\frac{(2m+1)!\,2^{m+2}}{3!\,(m-1)!}\Bigl[\sum g_{ij}g_{ik}g_{il} + \sum g_{ij}g_{ik}g_{jl}\Bigr]\Bigl(\sum g_{ij}g_{ik}\Bigr)^{m-1},$$

ignoring terms of relative order $O(n^{-1})$. The corresponding moment of t is obtained to the
same order on dividing by $n^{4m+2}$; it depends only on var t and $\mu_3(t)$, where

$$\mu_3(t) \sim \frac{8}{n^2N^4}\Bigl[\sum g_{ij}g_{ik}g_{il} + \sum g_{ij}g_{ik}g_{jl}\Bigr] = \frac{8}{n^2N^4}\Bigl[\sum g_i^3 + \sum g_{ij}g_ig_j\Bigr],$$

where $g_i = \sum_{j=1}^{N} g_{ij}$. The distribution of t is thus specified to $O(n^{-1})$ by its first three moments.

The moment-generating function of the distribution of t in standard measure is

$$M(z) = e^{z^2/2}\Bigl(1 + \frac{\gamma_1}{6}z^3\Bigr) + O(n^{-1}),$$

where

$$\gamma_1 = \mu_3(t)/(\operatorname{var} t)^{3/2} = O(n^{-1/2}),$$

and the frequency distribution of $x = (t - r)/\sqrt{(\operatorname{var} t)}$ is*

$$f(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-x^2/2}\Bigl\{1 + \frac{\gamma_1}{6}(x^3 - 3x)\Bigr\}. \tag{14.1}$$

15. The effect of the $\gamma_1$ term in modifying the confidence limits based on normal theory
can be seen in the following way. Let ξ be the normal deviate whose chance of being exceeded
is P(ξ). The chance of x exceeding ξ is, from (14.1),

$$F(\xi) = P(\xi) + \frac{\gamma_1}{6}(\xi^2 - 1)\,\frac{e^{-\xi^2/2}}{\sqrt{(2\pi)}}.$$

If X is the correct limit such that F(X) = P(ξ), it is readily proved by successive
approximation that the formula

$$X = \xi + \frac{\gamma_1}{6}(\xi^2 - 1) \tag{15.1}$$

gives the appropriate value of X to $O(n^{-1})$. For example, the 5 and 1 % limits are
respectively $\pm 1.96 + 0.474\gamma_1$ and $\pm 2.58 + 0.941\gamma_1$.

16. In practice the value of var t has to be estimated from the sample, and although its
standard error can be shown to be $O(n^{-3/2})$ by the kind of argument already used, it is not
known how large the sample has to be before the error in estimating the variance can be
safely ignored. It is best to use the unbiased formula

$$\operatorname{var} t = \frac{1}{n(n-1)(n-2)(n-3)}\Bigl\{4\sum c_i^2 - \frac{2(2n-3)}{n(n-1)}\,c^2 - 2n(n-1)\Bigr\} \tag{16.1}$$

(which is easily proved) in calculating var t from the sample, especially if the standard error
of the mean value of t from a number of small samples is required.
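
As a numerical check, formulae (15.1) and (16.1) can be coded directly. This is a sketch under the reading adopted here, namely (16.1) as $\{4\sum c_i^2 - 2(2n-3)c^2/n(n-1) - 2n(n-1)\}/n(n-1)(n-2)(n-3)$ and (15.1) as $X = \xi + \gamma_1(\xi^2 - 1)/6$; the function names are illustrative, and c and $\sum c_i^2$ are the sample values.

```python
def var_t_unbiased(n, c, sum_ci_sq):
    """Unbiased estimate of var t from the sample totals c and sum of c_i^2, as (16.1) is read here."""
    bracket = (4 * sum_ci_sq
               - 2 * (2 * n - 3) * c * c / (n * (n - 1))
               - 2 * n * (n - 1))
    return bracket / (n * (n - 1) * (n - 2) * (n - 3))

def skew_adjusted_deviate(xi, gamma1):
    """Corrected limit X for a normal deviate xi and skewness gamma1, as (15.1) is read here."""
    return xi + gamma1 * (xi * xi - 1.0) / 6.0
```

With the totals given under Table 2 for the MA correlation (n = 30, c = 426, $\sum c_i^2 = 7470$) the first function returns 0.006630 to six places, the value quoted in § 17 for assessor A, and the coefficient of $\gamma_1$ for ξ = 1.96 comes out as 0.474.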

* Note that the approximation error in f(x) is relatively $O(n^{-1})$, a stronger result than would be
obtained from a Gram-Charlier approximation based on the first three moments only.



As the term in $\gamma_1$ is a small correction it is perhaps sufficient in moderate samples to take

Hr 2 

0 = iSMfc+fc) 1 - PatCi+cf-^P+%. (16-2) 

and p i (t) = ^ l G, y t = /t,(t)/(var<)*, (16-3) 

71 

where the first term in G is the sum of $c_{ij}(c_i + c_j)^2$ over all values of i > j. The unbiased formula
for $\mu_3(t)$ involves some rather tedious computation.

17. To illustrate the methods of the paper we consider an actual example. 

A set of thirty wool samples were visually graded in order of fibre fineness by three assessors. 
The mean fibre diameter for each wool sample was also determined by direct measurement. 
Table 1 shows the measured order (M) compared with that of the three assessors (A, B, C),
in ascending order of experience.

Table 1

      M    A    B    C          M    A    B    C
      1    5    2    1         16   12   14   16
      2    4    5    2         17   10   18   15
      3    9    6    6         18   30   21   25
      4    3    1    3         19   22   26   24
      5    6    7    4         20   16   22   19
      6    2    4    5         21   21   16   18
      7   15   19   10         22   29   20   23
      8   18    3   12         23   28   25   22
      9    8    8    7         24   19   27   26
     10   11    9    8         25   23   28   21
     11   17   13    9         26   20   23   27
     12   13   10   11         27    7   24   20
     13   24   17   17         28   26   29   28
     14   14   12   14         29   27   15   30
     15    1   11   13         30   25   30   29

The method of working will be seen from the $c_{ij}$ matrix for the MA correlation shown
in Table 2.

The correlations of the assessors' orders with the measured order are found to be

$t_A = 0.490$, $t_B = 0.724$, $t_C = 0.816$.

(i) Consider first the maximum confidence limits given by (10.2). The 5 % limits are

$-0.02 < r_A < 0.80$, $0.23 < r_B < 0.92$, $0.34 < r_C < 0.96$.

Again, using the transformation $w = \sin^{-1} t$, the 5 % limits are

$0.01 < r_A < 0.85$, $0.30 < r_B < 0.97$, $0.45 < r_C < 0.99$.

The values of w are $w_A = 0.512$, $w_B = 0.810$, $w_C = 0.954$.

The greatest difference is 0.442, and the upper limit to its standard error is $\sqrt{(4/n)} = 0.365$,
so on these grounds the difference between A and C would not be judged significant.

The 5 % limits are very wide, and the lack of significance is disappointing since C was
known to be an expert appraiser while A is relatively inexperienced, and one would have
expected an obvious difference between them.
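
The sample coefficient for an assessor can be recomputed from the ranks of Table 1. The sketch below is illustrative and not part of the original paper; the list `B` is assessor B's column as transcribed here, and it rests on the assumption that the two entries which cannot be read directly are 11 and 16, the only ranks otherwise missing from the column.

```python
from itertools import combinations

def kendall_t(ranks):
    """t = (P - Q) / (n(n-1)/2), the ranks being listed in the natural order of the other variate."""
    n = len(ranks)
    score = sum(1 if a < b else -1 for a, b in combinations(ranks, 2))
    return score / (n * (n - 1) / 2)

# Assessor B's ranks for M = 1, ..., 30 (assumed transcription; see the note above).
B = [2, 5, 6, 1, 7, 4, 19, 3, 8, 9, 13, 10, 17, 12, 11,
     14, 18, 21, 26, 22, 16, 20, 25, 27, 28, 23, 24, 29, 15, 30]
```

On this transcription there are 60 discordant pairs out of 435, so that the score is 315 and t = 315/435, which agrees with the quoted value 0.724 to three places.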



(ii) The variances estimated from the unbiased formula (16.1) are

$\operatorname{var} t_A = 0.006630$, $\operatorname{var} t_B = 0.005067$, $\operatorname{var} t_C = 0.002198$.

The estimated standard errors are therefore

$s_A = 0.081$, $s_B = 0.071$, $s_C = 0.047$.

The 5 % confidence limits, assuming normality, are

$0.33 < r_A < 0.65$, $0.58 < r_B < 0.86$, $0.72 < r_C < 0.91$.

Table 2

                c_ij                   c_i
    0-+-+-++++++++-+++++++++++++++      21
    -0+-+-++++++++-+++++++++++++++      21
    ++0---++-+++++-+++++++++++-+++      17
    ---0+-++++++++-+++++++++++++++      19
    ++-+0-++++++++-+++++++++++++++      23
    -----0++++++++-+++++++++++++++      17
    ++++++0+--+-+----+++++++++-+++      13
    +++++++0----+----++-++++++-+++       9
    ++-+++--0+++++-+++++++++++-+++      19
    ++++++--+0++++-+-+++++++++-+++      19
    +++++++-++0-+----++-++++++-+++      13
    ++++++--++-0++---+++++++++-+++      15
    ++++++++++++0----+---++----+++       7
    ++++++--++-+-0---+++++++++-+++      13
    --------------0+++++++++++++++       1
    ++++++--++----+0-+++++++++-+++      13
    ++++++--+-----+-0+++++++++-+++      11
    +++++++++++++++++0------------       5
    ++++++++++++-++++-0--++-+--+++      15
    +++++++-++-+-++++--0++++++-+++      17
    ++++++++++++-++++--+0++-++-+++      19
    +++++++++++++++++-+++0--------      11
    +++++++++++++++++-+++-0-------      11
    ++++++++++++-++++--+---0++-+++      15
    ++++++++++++-++++-+++--+0--+++      17
    ++++++++++++-++++--++--+-0-+++      15
    ++-+++--------+-----------0+++     -11
    +++++++++++++++++-+++--++++0+-      21
    +++++++++++++++++-+++--+++++0-      21
    +++++++++++++++++-+++--++++--0      19

    c = 426
    n = 30
    Σc_i² = 7470

Moreover, we should judge A and C to be significantly different at the 1 % level, and A and B
at the 5 % level. How far these conclusions are valid depends, of course, on the accuracy of
the variance estimates, but the conclusions seem to agree with what might have been
expected from prior knowledge of the assessors' capabilities.

(iii) The values of $\gamma_1$ calculated from (16.2) and (16.3) are

$\gamma_1(A) = -0.32$, $\gamma_1(B) = -0.36$, $\gamma_1(C) = -0.38$.

The distributions would not appear to be very skew, and the distribution of the difference
of two t's is probably nearly normal. The adjusted 5 % limits are, from (15.1),

$0.32 < r_A < 0.64$, $0.57 < r_B < 0.85$, $0.72 < r_C < 0.90$.




APPENDIX 

1. The question arises whether a particular parental form exists for which the variance
of t assumes the upper limit $2(1 - r^2)/n$. We surmise, though we cannot prove, that the
maximum possible variance is attained when the parent ranking has a 'canonical' form obtained
in the following way. Consider again the ranking

5 2 3 1 6 7 8 9 4.

The number of positive pairs S is 26, so that r = 0.44. Let us transform this so as to bring
the 1 to the beginning of the ranking but move the 9 so as to preserve the number S at 26.
The 1 passes over three members to go to the beginning and hence adds 3 to the score. The
9 must, therefore, proceed to the left over three numbers so as to subtract 3 from the score
and we reach

1 5 2 3 9 6 7 8 4.

Now operate similarly with 2 and 9, reaching

1 2 5 9 3 6 7 8 4.

Had our 9 been contiguous to the 1 and incapable of moving farther to the left, we should
have moved the 8 and so on. Proceeding with the process by moving back the 3 and the 9
and 8 we reach

1 2 3 9 5 6 8 7 4,

and again

1 2 3 4 9 8 7 6 5.

All the lower numbers 1 to 4 are in the right order and the remainder are in the inverse order.
We call this ranking the 'canonical' order for given S (or r). It is not always possible to
reduce a given ranking to canonical order, but there cannot be more than one individual
out of place.

2. Consider the effect of a series of transformations leading to the canonical form. The first
process, that of moving 1 and 9, will increase the value of S for some samples involving 1 but
not 9 (leaving the others unchanged), will decrease the value of S for some samples involving
9 but not 1 (leaving the others unchanged), and will, in general, not alter those involving both
1 and 9. Similarly for 2 and 8, and so on. The effect of the transformation is thus to increase
the values of S containing the lower numbers 1, 2, 3, etc., and to decrease those containing
9, 8, 7, etc. These values of S are themselves, in the canonical form, the greatest or least as
the case may be. Consequently the progress to the canonical form is accompanied by
increases in the number of high values of S and increases in the number of lower values, and
one might expect the spread of the distribution to tend to a maximum. In the example
quoted, the distributions of S in samples of 3 for the successive rankings are:

    Values of S            Frequencies f
         0          2     3     3     6    10
         1         15    13    16    10     0
         2         34    35    29    32    40
         3         33    33    36    36    34
      Totals       84    84    84    84    84

The sums ΣfS are all equal to 182. The sums ΣfS² are respectively 448, 450, 456, 462 and 466,
showing the canonical ranking to have the largest variance of the five.



3. There is, however, another way of carrying out this process. If the parent ranking is
inverted, r becomes -r, but the variance of samples of n drawn from the inverted ranking
remains the same, by symmetry. We may then reduce the inverted ranking to its canonical
form and reinvert it so that its coefficient is again r. This ranking we call the inverse canonical
form. It will be shown that for large N, when r > 0 the inverse canonical form yields a larger
variance for t than the direct canonical form.

Even in the example already quoted, the inverse canonical ranking (with one member
out of place) is

3 4 2 5 6 7 8 9 1,

which has a distribution

    Values of S    Frequency f
         0              2
         1             27
         2             10
         3             45
       Total           84

The sum ΣfS² is now 472, which is greater than the previous maximum 466.


4. Consider the canonical case when there are N members altogether, R at the beginning
in the right order, and N - R in the inverse order. If we select n - j members from the R
and j from the N - R the value of S for the sample of n is ½n(n - 1) - ½j(j - 1), and the relative
frequency of U = ½n(n - 1) - S is

$$\binom{R}{n-j}\binom{N-R}{j}\bigg/\binom{N}{n}.$$

Now suppose that N tends to infinity and R/N to the ratio p. The relative frequency of
U = ½j(j - 1) tends in the limit to

$$\binom{n}{j}p^{n-j}q^{j},$$

where q = 1 - p. The mean value of U is then

$$\tfrac12 n(n-1)q^2,$$

and since

$$t = 1 - \frac{2U}{\tfrac12 n(n-1)},$$

we must have

$$q = \{\tfrac12(1 - r)\}^{1/2}. \tag{4.1A}$$

The variance of U is

$$\operatorname{var} U = n(n-1)\,pq^2\{nq + \tfrac12(1 - 3q)\}, \tag{4.2A}$$

and so

$$\operatorname{var} t = 16pq^2\{nq + \tfrac12(1 - 3q)\}/n(n-1). \tag{4.3A}$$

5. If now the inverted parent ranking is reduced to canonical form, giving ratios p', q'
corresponding to p and q, we shall have

$$q' = \{\tfrac12(1 + r)\}^{1/2} \tag{5.1A}$$

and

$$\operatorname{var} t' = 16p'q'^2\{nq' + \tfrac12(1 - 3q')\}/n(n-1). \tag{5.2A}$$

Then since $q^2 + q'^2 = 1$,

$$\operatorname{var} t' - \operatorname{var} t = \frac{16(n-2)}{n(n-1)}\,(q' - q)(1 - q)(1 - q'). \tag{5.3A}$$

When r is positive, q' > q and var t' exceeds var t.

This result suggests that the maximum variance may be attained by the inverse canonical
ranking when r > 0 and by the direct canonical ranking when r < 0. With this choice of
parent ranking the variance of t for large n is

$$\operatorname{var} t \sim \frac{4\sqrt2}{n}\,(1 + |r|)^{3/2}\,\{1 - \sqrt{[\tfrac12(1 + |r|)]}\}. \tag{5.4A}$$

It is interesting to compare (5.4A) with our upper limit of $2(1 - r^2)/n$. Their ratio is
$\{2(1 + |r|)\}^{1/2}/[1 + \{\tfrac12(1 + |r|)\}^{1/2}]$, which varies from $2(\sqrt2 - 1) = 0.83$ when r = 0 to 1 when
r = 1. Evidently the upper limit to the variance cannot be much improved, since an actual
ranking has been found whose variance approximates to it for all values of r, when n is not
too small.
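
The ratio quoted at the end of the Appendix can be checked without any closed form for the variance. The following sketch (function names illustrative, not from the paper) takes j, the number of sample members drawn from the inverted block, as binomial with index n and parameter q, computes the variance of U = ½j(j - 1) directly from the binomial probabilities, and divides the resulting var t by the upper limit $2(1 - r^2)/n$.

```python
import math

def var_t_canonical(n, q):
    """Variance of t = 1 - 4U/(n(n-1)) with U = j(j-1)/2 and j binomial (n, q)."""
    p = 1.0 - q
    pmf = [math.comb(n, j) * q ** j * p ** (n - j) for j in range(n + 1)]
    u = [j * (j - 1) / 2 for j in range(n + 1)]
    mean = sum(w * x for w, x in zip(pmf, u))
    var_u = sum(w * (x - mean) ** 2 for w, x in zip(pmf, u))
    return 16.0 * var_u / (n * (n - 1)) ** 2

def ratio_to_upper_limit(n, r):
    """var t for the canonical form with q^2 = (1 + |r|)/2, divided by 2(1 - r^2)/n."""
    q = math.sqrt(0.5 * (1.0 + abs(r)))
    return var_t_canonical(n, q) / (2.0 * (1.0 - r * r) / n)
```

For n = 200 the ratio is close to $2(\sqrt2 - 1) = 0.83$ at r = 0 and rises towards 1 as r approaches 1, staying below 1 throughout, which is the behaviour described in the text.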

REFERENCES 

Daniels, H. E. (1944). The relation between measures of correlation in the universe of sample
permutations. Biometrika, 33, 129.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. 2nd ed., 1945. London: Charles Griffin
and Co.




TESTING FOR NORMALITY 

By R. C. GEARY, Cambridge University Department of Applied Economics 

1. Introduction 

The present communication, one of a series, has two main objectives: 

(1) To show that probabilities derived from the well-known analyses of variance and 
other ‘ small sample’ tables, which postulate universal normality, may differ seriously from 
the true probabilities when the universes are non -normal, even, in some cases, when the 
degree of non-normality is not considerable. 

(2) To determine the most efficient tests of normality from a wide field of alternative 
symmetrical tests. 

It may be useful to summarize very briefly previous work in so far as it is strictly relevant
to this study.* The modern theory may be regarded as having been initiated by Karl Pearson
who, in 1895, found the first approximation (i.e. to $n^{-1}$) to the variances and covariance of
√b₁ and b₂ for samples drawn at random from any universe and, assuming that √b₁ and
b₂ were distributed jointly with normal probability, constructed 'probability ellipses' from
which the probability of the same values occurring, had the universe, in fact, been normal,
could be inferred very approximately. A considerable advance in moment determination
was made by C. C. Craig (1928). In 1929, R. A. Fisher, in inventing cumulants, simple
functions of the sample moments, and formulating rules for finding their semi-invariants,
developed incidentally a technique for expanding to several terms in 1/n the moments of
√b₁ and b₂ when the universe was normal. This paper was followed soon after by another
(1930), fundamental for all succeeding work on this subject, in which R. A. Fisher ingeniously
applied combinatorial technique to the finding of exact values of the moments of normal
√b₁ and b₂, and gave inter alia the values of the second, fourth and sixth moments of √b₁
and of the first three moments of b₂. The fourth semi-invariant, together with many other
normal semi-invariants of b₂, was determined by J. Wishart in 1930, and a further advance
in R. A. Fisher's technique was made jointly by R. A. Fisher & J. Wishart in 1930. In
1932 Joseph Pepper gave the eighth normal moment of √b₁. Using R. A. Fisher's rules
C. T. Hsu and D. N. Lawley in 1940 gave the exact values for normal random samples of
the fifth and sixth moments of b₂. Using a method due to R. C. Geary (1933) (applying
C. C. Craig's ideas (1928) to the normal problem), R. C. Geary & J. P. G. Worlledge have
recently (1940) found the seventh moment of b₂.

So much for moment determination. In 1930, E. S. Pearson used appropriate Pearson-type
curves, applied to R. A. Fisher's (1929) approximations of the semi-invariants, to find
approximate frequency distributions of √b₁ and b₂. From the frequency distributions he
computed a table of 1 % and 5 % probability points at intervals for n from 50 to 5000 for √b₁,
and for n from 100 to 5000 for b₂.

Since at the time the prospect seemed remote of determining the frequency of normal b₂
on which reliance could be reposed for samples of moderate sizes, R. C. Geary (1935)†
suggested that the ratio, a, of mean deviation to standard deviation computed from the origin

* An excellent account of the development of moment theory up to the year 1930 was given by
J. Wishart (1930).

† The author was informed by M. Fréchet that this test was suggested by Bertrand, but has been
unable to check the reference.



might be used as a test of normality, and gave the 1 and 5 % probability points for this test
at intervals for normal samples of 6-100. E. S. Pearson compared experimentally Geary's
test with b₂ and suggested, for samples so large that comparison could safely be made, that
b₂ was probably somewhat more sensitive than a, a suggestion which will be examined
theoretically in this communication. In 1935 also, R. C. Geary showed that there was a
high (negative) correlation for normal samples between a(1) (see 3.1) and b₂,
and argued therefrom that the former should be nearly as efficient as b₂. In 1936,
R. C. Geary gave a table of 1, 5 and 10 % probability points of a(1) at intervals for samples
of 11-1001. In 1938, a brochure by R. C. Geary & E. S. Pearson was published by the
Biometrika Office entitled Tests of Normality, giving tables and diagrams of probability
points of a(1), √b₁ and b₂. There is considerable literature dealing with the effect of universal
non-normality on the normal tests, mostly by way of particular numerical examples: a
selection of papers on this subject is included in the list of references at the end of the paper.

2. Effect of non-normality 
(a) The z-test

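The effect of universal non-normality will first be considered in relation to the z-test.
If $x_1, x_2, \ldots, x_{n'}$ and $y_1, y_2, \ldots, y_{n''}$ are two independent samples drawn at random from the
same universe (normal or non-normal) it is easy to show that, if

$$z = \tfrac{1}{2}\log_e\left\{\frac{\sum_{i=1}^{n'}(x_i-\bar{x})^2}{n'-1}\bigg/\frac{\sum_{i=1}^{n''}(y_i-\bar{y})^2}{n''-1}\right\}, \qquad (2.1)$$

then

$$M_2 \simeq \frac{\beta_2-1}{4}\left(\frac{1}{n'}+\frac{1}{n''}\right) \qquad (2.2)$$

when both n' and n'' are so large that terms in n' and n'' of degree less than -1 are regarded
as negligible. This is an obvious generalization of the approximate formula given by
R. A. Fisher* for normal samples, namely,

$$M_2^0 \simeq \frac{1}{2}\left(\frac{1}{n'}+\frac{1}{n''}\right). \qquad (2.3)$$
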
It may be useful also to give formulae for the first and second moments from zero for z
when the two random samples are drawn not necessarily from the same universes, though
both universes have mean zero and the same variance $\lambda_2$:



* Statistical Methods for Research Workers, 8th ed. p. 219.




where the $\lambda$'s indicate semi-invariants of the two universes of the orders indicated. In these
formulae, in effect, terms to order -2 in n', n'' are retained.

When both samples are large the frequency distribution of z will approach normality
provided that $\mu_4$ is finite. The effect of universal kurtosis can accordingly be assessed in a
very rudimentary manner from (2.2) and (2.3). The z-deviate $\xi$ corresponding to, say, the
2½ % normal probability point is

$$\xi = 1.9600\sqrt{M_2^0}. \qquad (2.5)$$

If, however, the universe were not normal and had, in fact, a variance $M_2$ with $\beta_2 \neq 3$, the
actual probability of a deviation in excess of $\xi$ in absolute value would be, not 0.05, but the
normal probability appropriate to a unit variance deviate of $1.9600\sqrt{M_2^0/M_2}$. On this consideration
the actual probabilities for different values of $\beta_2$, where the assumed probability is 0.05, are
shown in the fifth column of Table 1.


Table 1. Effect on probability of z of change in universal kurtosis, for large samples

  beta_2    M_2^0/M_2    sqrt(M_2^0/M_2)    1.9600 sqrt(M_2^0/M_2)    Actual probability
  1.5       4            2                  3.9200                    0.000089
  2         2            1.4142             2.7718                    0.0056
  2.5       1.3333       1.1547             2.2632                    0.024
  3         1            1                  1.9600                    0.050
  3.5       0.8000       0.8944             1.7530                    0.080
  4         0.6667       0.8165             1.6003                    0.110
  4.5       0.5714       0.7559             1.4816                    0.138
  5         0.5000       0.7071             1.3859                    0.166
  5.5       0.4444       0.6667             1.3065                    0.191
  6         0.4000       0.6325             1.2397                    0.215


The table shows that, if the universe from which the samples are drawn has $\beta_2 = 6$, the
true probability is about 1 in 5 instead of the assumed 1 in 20. It is, of course, true that
universes with so large a kurtosis are unusual. This view cannot be held of the range 2.5 to 4
for $\beta_2$, in which the probability, assumed to be 0.05, can be anything, in fact, from 0.024
to 0.110. Accordingly, if universal kurtosis is markedly negative, use of the standard table
masks significant differences; if kurtosis is positive the standard table exaggerates these
differences. Unless systematic tests have established that kurtosis is negligible the standard
table should not be used for testing significant differences in variance.
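
The last column of Table 1 can be recomputed directly. The sketch below assumes, as the second column of the table indicates, that the ratio of the normal-theory variance to the actual variance of z is 2/(beta_2 - 1); the function name is illustrative and not part of the paper.

```python
import math

def actual_two_sided_probability(beta2, nominal_deviate=1.96):
    """Large-sample probability that |z| exceeds the normal-theory 5 % limit.

    Assumes M2(normal) / M2(actual) = 2 / (beta2 - 1), i.e. columns 2-4 of Table 1.
    """
    ratio = 2.0 / (beta2 - 1.0)
    deviate = nominal_deviate * math.sqrt(ratio)
    # Two-sided normal tail area: P(|U| > d) = erfc(d / sqrt(2)).
    return math.erfc(deviate / math.sqrt(2.0))

for beta2 in (1.5, 2, 2.5, 3, 3.5, 4, 4.5, 5, 5.5, 6):
    print(f"{beta2:4.1f}  {actual_two_sided_probability(beta2):.6f}")
```

For beta_2 = 3 the function returns the nominal 0.05, and for larger beta_2 the probability rises steadily, as in the table.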

The foregoing analysis gives a theoretical explanation of the striking experimental results
of E. S. Pearson (1931b) working, however, with a test function

* - 'i (*, - { .£ (-r, - r)« + ^ (y, - <y) 2 j 

and with sample sizes »' = ft and n" 20. smaller than those contemplated in the present 
analysis. With 500 samples Pearson showed that when the frequency at the two tails
together expected from normal theory was 15.4 (= probability 0.0308) the frequencies
actually found in symmetrical universes with $\beta_2$ = 2.5, 4.1 and 7.1 respectively were 7, 39
and 47, equivalent to probabilities of 0.014, 0.078 and 0.094.




If tests of normality indicate universal kurtosis, either of two courses might be adopted:

(i) Assume that z is normally distributed with variance $M_2$ computed from (2.2) with
$(\beta_2 - 3)$ estimated as $k_4/k_2^2$ from the sample, $k_2$ and $k_4$ being R. A. Fisher's (1929) cumulant
functions.

(ii) Enter the standard table, not with z computed from the samples but with $z\sqrt{M_2^0/M_2}$,
estimating $M_2$ as in (i).

Both of these procedures are, of course, open to the objection that, unless the samples
are extremely large the estimate of $\beta_2$ is unlikely to be accurate; the real $\beta_2$ might be larger
or smaller than the estimate. Any probabilistic inferences should accordingly be accepted
with reserve.

It is fortunate that the condition specified in the foregoing paragraphs, namely, that the
numbers in the two samples are both large, rarely applies in practical applications. It more
usually happens that the number of classes is small, whereas the number per class is relatively
large. In this case E. S. Pearson (1931b) has shown that the first approximation to $\sigma_z^2$ is independent
of $\beta_2$, from which he inferred that the actual probability when the total number of samples
was large was inconsiderably influenced by kurtosis. In view of the foregoing analysis it
seemed to the writer desirable to carry the inquiry a stage further.

Suppose, then, that k samples are drawn at random from the same universe, $n_j$ in the jth
sample, the total $\sum_j n_j = n$. It is assumed that n is so large that terms in $n^{-2}$ are negligible,
that the number of samples k is small, and that all the $n_j$ are of the same order of magnitude
as n, i.e. that if

$$n_j = \pi_j n, \qquad \sum_j \pi_j = 1, \qquad (2.6)$$

none of the $\pi_j$ is negligibly small.

Using R. A. Fisher's cumulant notation with subscript to indicate the sample from which
the cumulants were computed, the mean for the jth sample is written $k_{1j}$ and its variance
$k_{2j}$. Then

>1 ^ 
3 d 

— 

II 

where 

(k-l)X = S^iy-A-,) 2 = £ W 

j 

so that 

^ A' = s *,( l -7Tj) 4J, 

n j <r j > 

and 

(n-k)Y = Z(nj-\)k 2p 

i 

so that 

ii 

where 

. n f -l 

n-l- 


Without loss of generality let the universal mean be zero and the variance unity. It may
easily be shown that $EX = EY = 1$. Set

$$w = X/Y = \{1+(X-1)\}\{1+(Y-1)\}^{-1}.$$

Then

$$w = \{1+(X-1)\}\{1-(Y-1)+(Y-1)^2-(Y-1)^3+\ldots\},$$
$$w^2 = \{1+(X-1)\}^2\{1-2(Y-1)+3(Y-1)^2-4(Y-1)^3+\ldots\}. \qquad (2.8)$$




We shall compute the approximate values of $Ew$ and $Ew^2$, i.e. the values to order $n^{-1}$; the
symbol $\simeq$ denotes 'equal to, to approximation required'. From values of the variances
and covariances given by E. S. Pearson (1931b) in his equations (9)-(11), we have


mx - !)• - r !- 1 + , D^-ry,, 




E(X-l)(Y-l) 

E(Y- l)*^ 4 ^?, 
n 


n 


(2-9) 


with 

We require 



$$a_r = \sum_j \pi_j^r.$$



+ 2 S S n j n y{ l-n J — n i ' + 37^7^) - 4 2 2 n i n j .n j ..(\ - Snj) k^kyhy, 

if if f 


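$$Y - 1 = \sum_j \phi_j (k_{2j} - 1) = \sum_j \phi_j k'_{2j}, \text{ say,}$$
remembering that, by definition of cumulants,

$$E k_{2j} = \lambda_2 = 1.$$

Also

$$(Y-1)^2 = \sum_j \phi_j^2 k'^{\,2}_{2j} + 2\sum_{j>j'} \phi_j \phi_{j'} k'_{2j} k'_{2j'}.$$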

It will be useful for what follows to note that 


Using R. A. Fisher's formulae (1929) for formation of joint semi-invariants of $k_1$ and $k_2$,
and noting that the k samples are independent, we find from the foregoing


n(k - 1 ) EX( Y - 1 ) 2 ^ (k - 1 ) (A 4 + 2), 
n(k - 1 )* EX*( Y - 1 ) ^ 2 (A* - 1 ) A 4 , 
n(k— 1 ) 2 EX 2 ( Y - 1 ) 2 ^ (jfc* - 1 ) (A 4 + 2)., 


( 210 ) 


Then, from (2.8), (2.9), (2.10),

$$Ew \simeq 1 + \frac{2}{n},$$
$$Ew^2 \simeq \frac{k+1}{k-1} + \frac{1}{n(k-1)^2}\{6(k^2-1) - (k^2+2k-2-a_{-1})\lambda_4\}. \qquad (2.11)$$


These are the formulae required. It will be noted

(i) that the terms free of $n^{-1}$ are independent of $\lambda_4$, which is equivalent to E. S. Pearson's
result (1931b);

(ii) that the formulae (2.11) agree with the normal values



to $n^{-1}$ when $\lambda_4 = 0$;

(iii) the approximations at (2.11) are free of $\lambda_3$.
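
Observation (ii) can be checked numerically. In the normal case w is a variance ratio with (k - 1, n - k) degrees of freedom, so its exact moments are the standard F-distribution moments; the sketch below compares them with the order n^{-1} expressions, reading (2.11) with lambda_4 = 0 as Ew = 1 + 2/n and Ew^2 = (k+1)/(k-1) + 6(k^2-1)/{n(k-1)^2}. The function names are illustrative.

```python
from fractions import Fraction

def f_raw_moments(k, n):
    """Exact E(w) and E(w^2) for w ~ F(k-1, n-k); requires n - k > 4."""
    v1, v2 = k - 1, n - k
    m1 = Fraction(v2, v2 - 2)
    m2 = Fraction(v2 * v2 * (v1 + 2), v1 * (v2 - 2) * (v2 - 4))
    return m1, m2

def approx_normal_moments(k, n):
    """Order n^-1 approximations, i.e. (2.11) with lambda_4 = 0 (as read here)."""
    m1 = 1 + Fraction(2, n)
    m2 = Fraction(k + 1, k - 1) + Fraction(6 * (k * k - 1), n * (k - 1) ** 2)
    return m1, m2

exact = f_raw_moments(4, 1000)
approx = approx_normal_moments(4, 1000)
print(float(exact[0] - approx[0]), float(exact[1] - approx[1]))
```

The differences are of order n^{-2}, which is the agreement claimed.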



( 2 - 12 ) 




The approximations at (2.11) tend to confirm E. S. Pearson's result that, when n is large
compared with k, the effect of universal kurtosis is unimportant. It would be useful, however,
to compute the approximate true probability for different values of k, n, $\lambda_4$ and $a_{-1}$. For this
and for subsequent work the following lemma* will be found useful:

If $f(x)$ and $\phi(x)$ are two frequency densities with semi-invariants $L_m$ and $L'_m$ (m = 1, 2, ...),
respectively, then, formally,

$$f(x) = \exp\left\{\sum_{m=1}^{\infty}\frac{(L_m - L'_m)}{m!}\left(-\frac{d}{dx}\right)^m\right\}\phi(x). \qquad (2.13)$$


For the present application take as generating function $\phi$ the frequency distribution of w
in the normal case, i.e.

$$\phi(w) = \frac{\left(\frac{n-3}{2}\right)!}{\left(\frac{k-3}{2}\right)!\left(\frac{n-k-2}{2}\right)!}\left(\frac{k-1}{n-k}\right)^{\frac{1}{2}(k-1)} w^{\frac{1}{2}(k-3)}\left\{1+\frac{(k-1)w}{n-k}\right\}^{-\frac{1}{2}(n-1)}, \qquad (2.14)$$

and, from (2.11),

$$L_2 - L'_2 \simeq -\frac{(k^2+2k-2-a_{-1})\lambda_4}{n(k-1)^2}. \qquad (2.15)$$

Assume that $L_m - L'_m \simeq 0$ $(m \neq 2)$.

Then if the 'normal theory' probability corresponding to the sample value w be p, the
approximate 'true' probability, subject to (2.15), will be about (p + p'), where p' is given by

$$p' = \tfrac{1}{2}(L_2 - L'_2)\int_w^{\infty}\phi''(w)\,dw = -\tfrac{1}{2}(L_2 - L'_2)\,\phi'(w). \qquad (2.16)$$


The term p', of course, merely corrects for the non-normal term in $n^{-1}$ in the variance of z;
it takes no account of corrections due to terms of higher (negative) orders in n or even of
non-normal terms in $n^{-1}$ in semi-invariants $L_m$ (m > 2). The calculation is designed merely
to show whether the standard table probability requires correction for universal kurtosis;
this will appear if p' is of the order of magnitude of p.
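
A rough numerical sketch of the correction term may help. It assumes that the generating function is the ordinary variance-ratio (F) density with (k - 1, n - k) degrees of freedom, that the difference of variances in (2.15) reads -(k^2 + 2k - 2 - a_{-1}) lambda_4 / {n(k-1)^2}, and that p' is -(1/2)(L_2 - L'_2) times the slope of the density at w. The function names and the finite-difference step are illustrative choices, not the author's.

```python
import math

def f_density(w, k, n):
    """Density of w in the normal case: F with (k-1, n-k) degrees of freedom."""
    v1, v2 = k - 1, n - k
    log_c = (math.lgamma((v1 + v2) / 2) - math.lgamma(v1 / 2) - math.lgamma(v2 / 2)
             + (v1 / 2) * math.log(v1 / v2))
    return math.exp(log_c + (v1 / 2 - 1) * math.log(w)
                    - ((v1 + v2) / 2) * math.log1p(v1 * w / v2))

def kurtosis_correction(w, k, n, lambda4, a_minus1, h=1e-5):
    """p' = -(1/2)(L2 - L2') * phi'(w), with L2 - L2' as read from (2.15)."""
    delta_l2 = -(k * k + 2 * k - 2 - a_minus1) * lambda4 / (n * (k - 1) ** 2)
    # Central difference for the slope of the density at w.
    slope = (f_density(w + h, k, n) - f_density(w - h, k, n)) / (2 * h)
    return -0.5 * delta_l2 * slope

# Four equal groups: pi_j = 1/4, so a_{-1} = sum of 1/pi_j = 16.
print(kurtosis_correction(3.0, k=4, n=104, lambda4=1.0, a_minus1=16.0))
```

With equal groups and positive kurtosis the correction in the upper tail comes out small and negative under these assumptions, in line with the remark that the effect of kurtosis is unimportant when n is large compared with k.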

(b) The t-test

In Geary's 1936 paper the expansions to terms in $n^{-2}$ of the first four moments of t, where

$$t = n^{\frac{1}{2}} k_1 / k_2^{\frac{1}{2}}, \qquad (2.17)$$

were given. Following are the first six semi-invariants L of t to the same approximation as
in the earlier paper:


i {2' + I In (2As " 2As + 5A3A4) I + • • •’ 


nr 


L *=* 1 + 1(8 + 7 A|) n -1 + (6 - 2A 4 - |Af - A s + V£A§A 4 ) n~* 

L s ^ - 2A 3 n * - (9A S - 3A 6 + ^A 3 A 4 + VA|) »-*, 

- 2A 4 + 12A|) n ~ l + (54 - 18A 4 4 4A 6 4- 75A| - 63A 3 A 6 - 6Af 4- 81 A|A 4 4- 
L 6 ^~ (60A 8 - 6A 5 - 20A 3 A 4 4- 1 05A|) n~ l , 

£„=*( 240- 120A 4 4-577$A§4- 16A 6 - 210A 3 A 6 - 150A|A 4 4- 1200A|)n 2 . 

Throughout this subsection we take i _ y /x'im 

* Due to Charlier and termed the 'Differential Series' by the Scandinavian School.
† 1936 formula corrected.


(2-18) 




where the $\lambda_i$ are the semi-invariants of the parent universe. For these expressions terms in
$n^{-5/2}$ are neglected. They were derived from the moments (from zero) $M'_i$ of t, which were
obtained by the method described in the 1936 paper. It will be noted that, to the approximation
used, the expressions involve only the first six semi-invariants of the parent universe.
When the parent universe is normal all the $\lambda_i$ (i > 2) are zero. The magnitude of the numerical
coefficients in the foregoing approximate expressions for the $L_i$ indicates that, when the
universal values of the $\lambda_i$, particularly those of uneven order, are not very small, the frequency
distribution of t may differ appreciably from the classical Gosset-Fisher (1908, 1925) distribution.

The formal Gram-Charlier expression for the frequency of t could, of course, be written
down at once from (2.18). It is doubtful, however, if the Gaussian can be regarded as the
most appropriate generating function for the frequency of t because, even when the parent
universe is normal, the semi-invariants $L'_{2m}$ of the higher even orders are large for moderate
values of n. For example,

$$L'_4/L'^{\,2}_2 = 6/(n-5), \qquad L'_6/L'^{\,3}_2 = 240/\{(n-5)(n-7)\}.$$
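
These two ratios follow from the standard even moments of Student's distribution with n - 1 degrees of freedom. The sketch below, with an illustrative function name, verifies them in exact rational arithmetic.

```python
from fractions import Fraction

def standardized_t_cumulants(n):
    """Return (L4/L2^2, L6/L2^3) for Student's t with nu = n - 1 d.f. (n > 7)."""
    nu = n - 1
    mu2 = Fraction(nu, nu - 2)
    mu4 = Fraction(3 * nu**2, (nu - 2) * (nu - 4))
    mu6 = Fraction(15 * nu**3, (nu - 2) * (nu - 4) * (nu - 6))
    r4 = mu4 / mu2**2
    r6 = mu6 / mu2**3
    # Cumulants of a symmetric law: k4 = mu4 - 3 mu2^2, k6 = mu6 - 15 mu4 mu2 + 30 mu2^3.
    return r4 - 3, r6 - 15 * r4 + 30

for n in (10, 20, 50):
    k4, k6 = standardized_t_cumulants(n)
    print(n, k4, k6, Fraction(6, n - 5), Fraction(240, (n - 5) * (n - 7)))
```

For n = 10 the sixth-order ratio is 16, which illustrates how slowly the higher even semi-invariants die away for samples of this size.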
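It is proposed to use (2.13) for finding the approximate frequency with

$$T = \frac{\left(\frac{n-2}{2}\right)!}{\left(\frac{n-3}{2}\right)!\,(n-1)^{\frac{1}{2}}\pi^{\frac{1}{2}}}\left(1+\frac{t^2}{n-1}\right)^{-\frac{1}{2}n},$$

the Gosset-Fisher frequency. Let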



It can easily be shown that the rth derivative (in t) of T x is 


(2-19) 


(2-20) 


n r) (<;«) = (-r 


(re + r - 1)! 

(re — ])! (re — l) r 


\r-ni 


r ( r - 1 ) t r-2 + n _ r(r- l)(r - 2 )<r - 3) #r _ 4 


2.4 


with 


_ r(r-l)(r~ 2) (r - 3) (r- 4) (r - 5) ) / ‘ ^ 

2.4.6 * + -|\ 1+ n-lj 

= (»- 1) 2 ( a - 1) 3 

n + 1 ' 2 (re+ l)(re + 3)’ (w+l)(n + 3)(n + 5)’ 


( 2 - 21 ) 


Note that (2.21) assumes the Hermite form when n = ∞.

The theory will now be applied to particular examples using in all cases n = 10. The
universes will be assumed to belong to the Karl Pearson system, so that (M. G. Kendall,
1941) the values of $\lambda_5$ and $\lambda_6$ can be derived (given $\lambda_3$ and $\lambda_4$) from the following equations:

$$(1+4\eta)\lambda_3 + 2\xi = 0,$$
$$(1+5\eta)\lambda_4 + 3\xi\lambda_3 + 6\eta = 0,$$
$$(1+6\eta)\lambda_5 + 4\xi\lambda_4 + 24\eta\lambda_3 = 0,$$
$$(1+7\eta)\lambda_6 + 5\xi\lambda_5 + 10\eta(4\lambda_4 + 3\lambda_3^2) = 0. \qquad (2.22)$$

From the first two equations

$$\eta = (2\lambda_4 - 3\lambda_3^2)/(-10\lambda_4 + 12\lambda_3^2 - 12),$$

which, substituted in the first equation of (2.22), gives $\xi$. The values of $\xi$ and $\eta$, substituted
in the third and fourth equations, give $\lambda_5$ and $\lambda_6$. From (2.18), the $L'_i$ being the semi-invariants
when the parent universe is normal (i.e. the values found when all the $\lambda$'s are set equal to
zero),

$$L_1 - L'_1 \simeq J_1 n^{-\frac{1}{2}} + K_1 n^{-\frac{3}{2}}, \qquad L_4 - L'_4 \simeq J_4 n^{-1} + K_4 n^{-2},$$
$$L_2 - L'_2 \simeq J_2 n^{-1} + K_2 n^{-2}, \qquad L_5 - L'_5 \simeq K_5 n^{-\frac{3}{2}},$$
$$L_3 - L'_3 \simeq J_3 n^{-\frac{1}{2}} + K_3 n^{-\frac{3}{2}}, \qquad L_6 - L'_6 \simeq K_6 n^{-2}. \qquad (2.23)$$
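
The derivation of the fifth and sixth semi-invariants from the Pearson-system equations (2.22) can be sketched as follows. The equations are taken in the form (1+4η)λ3 + 2ξ = 0, (1+5η)λ4 + 3ξλ3 + 6η = 0, (1+6η)λ5 + 4ξλ4 + 24ηλ3 = 0 and (1+7η)λ6 + 5ξλ5 + 10η(4λ4 + 3λ3²) = 0, with unit variance; this reading of the printed coefficients is an assumption, and the function name is illustrative.

```python
from fractions import Fraction

def pearson_higher_cumulants(lam3, lam4):
    """Solve the four equations for xi, eta, lambda_5, lambda_6 (lambda_2 = 1)."""
    lam3, lam4 = Fraction(lam3), Fraction(lam4)
    # eta from the first two equations, then xi from the first.
    eta = (2 * lam4 - 3 * lam3**2) / (-10 * lam4 + 12 * lam3**2 - 12)
    xi = -(1 + 4 * eta) * lam3 / 2
    # Third and fourth equations give lambda_5 and lambda_6 in turn.
    lam5 = -(4 * xi * lam4 + 24 * eta * lam3) / (1 + 6 * eta)
    lam6 = -(5 * xi * lam5 + 10 * eta * (4 * lam4 + 3 * lam3**2)) / (1 + 7 * eta)
    return xi, eta, lam5, lam6

# Type III (gamma, shape 4): lambda_3 = 1, lambda_4 = 3/2.
print(pearson_higher_cumulants(1, Fraction(3, 2)))
# Student's t with 9 d.f., standardized: lambda_3 = 0, lambda_4 = 6/5.
print(pearson_higher_cumulants(0, Fraction(6, 5)))
```

The two calls reproduce the known values for a gamma universe (λ5 = 3, λ6 = 15/2) and for a standardized Student universe (λ5 = 0, λ6 = 16), which checks all terms except the coefficients of ηλ3 and ηλ3².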

The J and K are the terms in the $\lambda$ in (2.18). To $n^{-2}$ (i.e. ignoring $n^{-5/2}$) the frequency generated
from T of (2.19) is as follows:

M = T + n-» \j,D + 1 D 2 } + n~* {-* (J, + J\) + ~ i-h + ^ ^l] 

( n3 ns 

+ (^3 + 3/j *^2 + *^l) + J20 ^ ^1*^4 + 10*4^3 + 10«/| Js) 

+ Hi (j3 ' /2+2,7l * 73a)+ m6 7)9 ) + w_8 {? (*«+^i a ’i>+£< jc * + 4 « , i*» 

ne 

+ iJ a K x + 3J| + 6 J\ J % + J\) + — (JP, + 6JjA r 5 + 20J3A3 + 15J 2 </ 4 


+ &0ffJ 2 J a + 15«/f «/ 4 -H'20«7®./ 3 ) 4- 


Z> 8 

11,520 


nio J4 \ 

+ 80J, J 8 J 4 + 80J 2 J*) + ~ - (3J|./ 4 + 4J t Jf) + /> 12 < , 


(16J 3 A 5 +10JJ + 80J 8 J 2 


./J 


31,104 


(2-24) 


with 


D h 


■(-IT-- 


To $n^{-1}$, (2.24) agrees with the formula given by M. S. Bartlett (1935), in which, however,
there is a small and obvious slip in a sign. The law of formation of the numerical coefficients
of (2.24) is evident; for instance, the numerical coefficient of $D^8 J_2 J_3^2$ is $1/144 = 1/(2!\,3!^2\,2!)$.
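
The law of formation can be stated as a small routine: a product of semi-invariant differences of orders m_1, m_2, ... carries the factor 1/m! for each order and 1/e! for each order repeated e times, and multiplies D raised to the sum of the orders. This is the usual rule for expanding the exponential in (2.13); the function name is illustrative, and signs attached to D are ignored.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def term_coefficient(orders):
    """Coefficient and power of D for a product of terms of the given orders."""
    coeff = Fraction(1)
    for m, e in Counter(orders).items():
        # Each order m contributes (1/m!)^e, and a repeat count e contributes 1/e!.
        coeff /= factorial(m) ** e * factorial(e)
    return coeff, sum(orders)

print(term_coefficient([2, 3, 3]))     # J_2 J_3^2
print(term_coefficient([3, 3, 3, 3]))  # J_3^4
```

The second call gives 1/31,104 with D^12, the divisor that appears in the last term of (2.24).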

The integrals $\int_t^{\infty}$ and $\int_{-\infty}^{-t}$ (t > 0) are found by reducing the exponent of D by unity, as follows:

I*" 1 Ddt = — T, f°° D 2m dt = f D^dt = ( ‘ D 2m+i dt •= D 2m . 

J -- 00 J t J — 00 J t J - 00 

(2-25) 

In normal theory the upper and lower 2½ % points of t are ±2.262 for n = 10. Table 2
shows the 'true' probabilities, i.e. the value of

$$\int_{-\infty}^{-2.262} f(t)\,dt \qquad (2.26)$$

for parent universes specified by $\lambda_3$, $\lambda_4$, using (2.24).

There are two observations to be made on the results presented in this table. The first is
that, despite the considerable number of terms (shown at (2.24)) included in the probability
expansion, the values found in the successive terms cannot be regarded as satisfactorily
convergent for so small a sample as 10, and, of course, the convergence disimproves with
increasing $\sqrt{\beta_1}$. Taken all together, however, they seem consistent and significant. The second
observation is that attention was confined to the negative 'tail' of the distribution. It may
be assumed that, in all cases, the distortion would be very considerably less marked if
regard were had to the probability for |t| > 2.262. Actually for universe 3 the probability




is 0.056, not significantly different from the normal theory probability of 0.05. In justification
of the attitude adopted above, the point might be put as follows:

We decide to accept the hypothesis that the universal mean is zero provided that the value
of t found from the particular sample satisfies $t_0 < t < t_1$, where

Prob $(t < t_0)$ = Prob $(t > t_1)$ = 0.025.

The table is designed to show that if the parent universe is markedly asymmetrical the
range $(t_0, t_1)$ may differ appreciably from $-t_0 = t_1 = 2.262$.


Table 2. Probabilities of t less than -2.262 for samples of 10 for seven universes

  Universe    sqrt(beta_1) = lambda_3    lambda_4 = beta_2 - 3    Probability
  Normal      0                          0                        0.025
  2           0                          1                        0.024
  3           1/2                        0                        0.041
  4           1/sqrt(2)                  1/2                      0.047
  5           1                          0                        0.072?
  6           1                          1                        0.086?
  7           1/2                        1/2                      0.043


As anticipated by earlier work (W. S. Gosset, 1908; R. C. Geary, 1936), the table shows
that the distortion is slight for symmetrical universes; even when $\lambda_4 = 1$ (and $\lambda_3 = 0$) the
probability (0.024) is practically identical with the normal value. There can be little doubt
that the standard table probabilities can be seriously at variance with the true probabilities
when the universes from which the samples are drawn are markedly asymmetrical.


(c) Difference of means
R. A. Fisher's (1925) test of significance

$$t = \frac{(k'_1 - k''_1)\sqrt{(n'+n''-2)}}{\{(n'-1)k'_2 + (n''-1)k''_2\}^{\frac{1}{2}}}\sqrt{\left(\frac{n'n''}{n'+n''}\right)} \qquad (2.27)$$

for the difference of averages $k'_1$ and $k''_1$ in normal theory for random samples numbering n'
and n'' is, of course, a particular case of the analysis of variance considered in § (a) above.
The second cumulants are $k'_2$ and $k''_2$. It is assumed that the unknown universal means and
variances are equal. Suppose now that the random samples in reality have been derived
from universes in which the means are equal but the other semi-invariants $\lambda'_i$ and $\lambda''_i$ are not
necessarily zero for i > 2, or even necessarily equal. Since the universal means are assumed
equal, without loss of generality we may take $\lambda'_1 = \lambda''_1 = 0$. This general mathematical model
seems to be the correct one; we are not trying to determine the probability of the samples
being derived from the same universe but rather if they could conceivably have been drawn
from universes with the same arithmetic mean, however much they may differ otherwise.
The correctness or otherwise of the concept may be considered in relation to, say, the
problem of deciding from two random samples which of two types of fertilizer is to be preferred
from yield observations on a given crop on a given kind of land. Undoubtedly the
prime problem will be that of ascertaining which is probably the better yielding (i.e. whether
the arithmetic means are significantly different). Of considerably less importance is the



question of which fertilizer is the more variable; of less importance still is the question of
deciding, say, whether with approximately equal yields one universe is symmetrical and
the other markedly asymmetrical. The point is that the question of the equality of universal
means should be considered without assuming that the other semi-invariants in the universes
from which the samples have been drawn are necessarily equal. This essentially is also the
viewpoint in R. A. Fisher's randomization method.
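
For reference, the statistic (2.27) is the usual pooled two-sample t written in terms of the first two sample cumulants. A minimal sketch, with an illustrative function name and made-up observations, is:

```python
import math
from statistics import mean, variance

def pooled_t(xs, ys):
    """Two-sample t of (2.27): k1 = sample mean, k2 = unbiased sample variance."""
    n1, n2 = len(xs), len(ys)
    k1x, k1y = mean(xs), mean(ys)
    k2x, k2y = variance(xs), variance(ys)
    pooled = (n1 - 1) * k2x + (n2 - 1) * k2y
    return ((k1x - k1y) * math.sqrt(n1 + n2 - 2) / math.sqrt(pooled)
            * math.sqrt(n1 * n2 / (n1 + n2)))

# Hypothetical yields under two fertilizers (illustrative numbers only).
print(pooled_t([1, 2, 3], [2, 4, 6, 8]))
```

The same value is obtained from the textbook form, difference of means divided by the pooled standard error.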

Expanding the denominator of (2.27) in terms of $(k'_2 - \lambda'_2)$ and $(k''_2 - \lambda''_2)$ and computing
therefrom the first few terms of the first four moments of t, we find the following approximations
to the first four semi-invariants:


with 


AL^- 


iK-K) 

2(n% + n"X;y 


A 2 L, 




2 »'A;*4-«"A; 2 \ 
ii'K + n"K ) 


A*L, 



, (n’ 2 — n" 2 ) (A^Aj- 

AjAi) 


n'n"{n'\ 2 -f 

n"A ") 2 

a 3 a; 

3(A 3 — A 3 ) / 

'A, 

,A?\ 

n' 2 n" 2 

(n'Aj ■fn'AJ) \ 

y‘ 


6 (w'Ai a + w"A^) /A; 

| 2 _ 

6 /A3. 

~ (n'Ag + 

n"\” 2 ) 2 \»' n'J 

1 

U ' 2 

, 18(A'~ 

K ) 2 M' a;\ 

a; 

, a; 

(w'Ai + n 

*A 3 ) 2 U «7 

n' 3 

n" 3 

_ 3 P 2 , AS 

\ {X' t (n'n"A 2 + 2 w,* 2 - 

-n % X\ 

V n* 

7 " 

" n'n"(n' 


»\2 


JHK-K) 

A(n'A' 2 + n"\l) 


(A s — A a ) 


(n'A'+n"A') 


■n' 2 K)) 


(2-28) 


, ( (n'_+ n"\ (\' 2 n' - 1 + AJ n" - 1 )| 1 

|\ n'n" f (»' + »" — 2) j 


Using formula (2.24) to the term in $n^{-1}$ with the Gosset-Fisher function again as generating
function, Table 3 shows rough approximations, for four examples, to the 'true' probability
of values of $t < \tau$, where $\tau$ is the (negative) value for probability 0.025 from the normal table,
and $\lambda'_2 = \lambda''_2 = 1$. When the two samples are drawn from different universes the distortion
can accordingly be considerable. The third example suggests that if the universes are the
same the distortion is small, a result to be anticipated from the fact (apparent from (2.28))
that, to the approximation used, the first two semi-invariants are equal to their normal
theory values; this theory confirms the experimental results of E. S. Pearson & N. K. Adyanthaya
(1929).


Table 3

  Example    n'    n''    lambda'_3    lambda''_3    lambda'_4    lambda''_4    Probability
  1          12    4      1            -1            1            -1            0.045
  2          18    6      1            -1            1            -1            0.041
  3          7     4      1/sqrt(2)    1/sqrt(2)     1/2          1/2           0.027
  4          10    6      1            0             1            0             0.038





It should be remarked that the probabilities in Table 3 (as well as in Table 2) are merely
rough approximations; the samples used are far too small for the results to have any pretension
to accuracy. The object has been merely to show that the actual probability could be
considerably at variance with that shown in the standard table, for small samples.


3. Sufficient conditions for approach to normality of a(c) with increasing n

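The remainder of the paper deals with the field of symmetrical tests of normality, homogeneous
of degree zero, represented by (3.1). It is essential to establish the conditions of
approach to normality of the frequency distribution of a(c) as the sample number increases.

Let

$$a(c) = \frac{1}{n}\sum_{i=1}^{n}|x_i - \bar{x}|^c \bigg/ \left\{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2\right\}^{\frac{1}{2}c}, \qquad (3.1)$$

$$a_1(c) = \frac{1}{n}\sum_{i=1}^{n}|x_i|^c \bigg/ \left\{\frac{1}{n}\sum_{i=1}^{n}x_i^2\right\}^{\frac{1}{2}c}, \qquad (3.2)$$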

where $\bar{x} = \sum x_i/n$ and c is non-negative. It will be shown in succession that, subject to stated
conditions, with increasing n,

(i) the frequency distribution of $a_1(c)$ tends towards normality, and

(ii) the frequency distribution of a(c) tends towards that of $a_1(c)$ and hence towards
normality.
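
The statistic a(c) is simple to compute. The sketch below takes it as the mean of the cth powers of the absolute deviations from the sample mean, divided by the (c/2)th power of the mean squared deviation; the function name and the sample values are illustrative.

```python
def a_statistic(xs, c):
    """a(c): mean |x - xbar|^c divided by (mean (x - xbar)^2)^(c/2)."""
    n = len(xs)
    xbar = sum(xs) / n
    numerator = sum(abs(x - xbar) ** c for x in xs) / n
    denominator = (sum((x - xbar) ** 2 for x in xs) / n) ** (c / 2)
    return numerator / denominator

sample = [1.0, 2.0, 3.0, 4.0, 10.0]
print(a_statistic(sample, 1))  # ratio of mean deviation to standard deviation
print(a_statistic(sample, 4))  # the kurtosis statistic b_2
```

Since a(c) is homogeneous of degree zero and uses deviations from the mean, shifting or rescaling the sample leaves it unchanged, and a(2) is identically 1.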

It is assumed, without loss of generality, that the universal mean of the universe from
which the sample of n is drawn is zero. Denote the kth absolute moment from zero by $\mu_{|k|}$,
k not being necessarily an integer. Given a positive quantity $\epsilon$ arbitrarily small, $\omega(\epsilon)$ can be


found so that 


Prob [ji Z(\ x, /^} > 1 -e, 


Prob 


-Z(x\-ft 2 ) 

Jt 


<(0 


\ 


-/ 4 ) 


> 1 — €, 


(3*3) 

(3-4) 


provided, of course, that $\mu_{|2c|}$ and $\mu_4$ exist. As n increases $\omega$ may be envisaged as approaching
the normal probability point appropriate to the probability $\epsilon$, since, in the conditions stated,
$\sum|x_i|^c/n$ and $\sum x_i^2/n$ are normally distributed in the limit. For samples which satisfy the
inequality in the brackets { } at (3.4) and if n is so large that


(i) 



</**> 


the denominator of (3.2) can be expanded to three terms (including the remainder) by
Taylor's theorem, so that $a_1(c)$ may be written

«.<«> - *.*-•■{> +1 +;^<)4 < 3 ' 5 > 

y { = (|*<| C -/*| C ,)K|, 
z { = (xS-/i t )l/i a , 

1 $ WRc+4) 


(O<0< 1). 




With probability exceeding $(1 - \epsilon)$ it is evident, from (3.4), that X is maximized by

/M~ Mc+4) 

\ M *»/ 

It will suffice, for the present purpose, to infer that

$$|X| < \kappa,$$

where $\kappa$ is a constant independent of n. We have now


Set 




_ i s±*i4. c -V.4_/f _ i Y\ 

«Ufci Mi 4 v«i \2 /)’ 

(3-0) 

and 

1 //4 c ®i( c ) 1 r ( C A- U 

(3-7) 

with 


(3-8) 


For samples which satisfy the inequalities in { } at (3.3) and (3.4) and hence with a probability
exceeding $(1 - 2\epsilon)$, we have


. ■ c(c + 2)KOJ 2 fl i / | (O //l m \ Z 

' ' 2cr nfi\ c] /i 2 Scr nft\ \ //, C |V n ) ' \jn' 

where £ is independent of n. Or, briefly, 


Prob 



> 


1 - 26 , 


(3- 10) 


so that u tends in probability towards zero with $1/\sqrt{n}$. Now (3.7) may be written in the form
u = Y' - Y, where Y' and Y are the respective terms on the left side. If A be any number
and F the total probability function, a well-known lemma (Fréchet, 1937, p. 164) shows that



l A+ i)- F r{ A ~i) ) +2e - 

(311) 

using (3*10). Hence the frequency distribution of 


y, _ 1 MJWc) j\ 

(312) 

cr \ 

/*lcl / 

tends towards that of Y = — l'(y ; — ^ z.- ) 

TUT \ 9t 2 7 

(313) 


at every continuity point of the latter frequency, as n tends towards infinity. But Y, from
(3.13), is the simple average of n random measures, and its frequency must tend towards
normality provided that its standard deviation exists; from (3.6) it is evident that $\sigma$ is
finite provided that $\mu_{|k|}$, where k is the greater of 2c and 4, is finite. Here and in the remainder
of this section it will be useful to remember that if $\mu_{|k|}$ exists so does $\mu_{|k'|}$ for $0 \le k' < k$.




To prove that the frequency distribution of a(c) tends towards that of $a_1(c)$ and hence
towards normality with increasing n it will be shown that $\sum|x_i - \bar{x}|^c/n$ tends in probability
towards $\sum|x_i|^c/n$. Two cases will be considered separately: (1) $c \ge 1$, (2) $1 > c \ge 0$.

Case (1). $c \ge 1$

For values of x { for which \x i \^\x\, 

I * | c — \-%i | c = + cz|*<-0x|«-i (O<0<1) 
and when |*,|<|*|, \\x i -x\ c -\x i \°\ £(2 c +l)|x| c . 

1 I n I /» n \ 

!*l(«JU* < l c ~ 1+ °l*H’ 


Hence 


n 


(l*<-*|) c -(!*<| c ) 


(3-14) 


B and C being independent of the $x_i$ and n but depending on c. With $\epsilon$ arbitrarily small
$\omega$ can be found so that


Hi'K-ysK* 

Prob j |; £ (l l~ [ <« 


(3-15) 


Hence, from (314) and (315), if and /a I 2c _ 21 exist, 


Prob 




<B ^ id) >1-8. 

Vn j 


|n 1 w 

for n sufficiently large, the constant B' depending on c but not on n. Hence for $c \ge 1$,
$\sum|x_i - \bar{x}|^c/n$ tends in probability towards $\sum|x_i|^c/n$. Incidentally, this proves that
$\{\sum(x_i - \bar{x})^2/n\}^{\frac{1}{2}c}$ tends in probability towards $\{\sum x_i^2/n\}^{\frac{1}{2}c}$, the latter two expressions representing
respectively the denominators of a(c) and $a_1(c)$.

Case (2). $1 > c \ge 0$

Let $\bar{x}$ satisfy a probabilistic inequality identical in form with the first equation of (3.15)
and let y be any positive quantity, fixed once for all. Let n (presently to be defined further)


(3-16) 


be so large that 


Then 



When j x { | 

>y (i.e. in E‘), 



1 x i -xY-\x i 

| r = ± cx | x-i-dx | e_1 (0 < 0< 1), 

so that 

Prob 1 1 |x £ — x | c — | 



(3-17) 

When | x - 1 < y (i.e. in -T"), given tj arbitrarily small and positive, n can be found so that 

(3*18) 

when \x\<(o 


* I* 

v » 


since | x | c (c > 0) is uniformly continuous in E". We then have 

Prob {| | x { — x | c — | x { | c | < 7)} > 1 — e. 


( 319 ) 





Combining (3-17) and (3-19), it may be inferred that 


Prob 





(3-20) 


the first term of the upper limit in { } tending to zero as n tends towards infinity, and $\epsilon$ and y
being arbitrarily small.

We have accordingly shown that the numerator and denominator of a(c) tend in probability
towards those of $a_1(c)$. Hence a(c) tends in probability towards $a_1(c)$. Hence, using
the lemma cited at (3.11), the total frequency of a(c) tends towards that of $a_1(c)$ which tends
towards normality as n tends towards infinity. Finally:

If $c \ge 0$ the frequency distribution of a(c), given by (3.1), tends towards normality as n tends
towards infinity provided that $\mu_{|k|}$, where k is the greater of 2c and 4, is finite.

It seems likely that an analogous theorem can be proved for $0 > c > -\tfrac{1}{2}$; we shall not, however,
be concerned in this communication with negative values of c.


4. Moments of a(c) for normal samples 

While it will be shown in later sections that, with indefinitely large samples, $\sqrt{b_1}$ and $b_2$ are
the most efficient tests of asymmetry and kurtosis, respectively, it by no means follows that
other tests are inefficient or that they may not be useful supplements in cases in which the
prime tests are indecisive as to the probable non-normality of a given sample. It is accordingly
proposed to give here close approximations to the first four moments (from the origin)
of a(c) (given by (3.1)) for normal random samples of n.

For normal samples (R. A. Fisher, 1929; R. C. Geary, 1933)

$$M'_k\{a(c)\} = E\{a(c)\}^k = E\left[\frac{1}{n}\sum|x_i - \bar{x}|^c\right]^k \bigg/ E\left[\frac{1}{n}\sum(x_i - \bar{x})^2\right]^{\frac{1}{2}ck}. \qquad (4.1)$$

The exact value of the denominator is, of course, known, for

$$Es^{k'} = \left(\frac{2}{n-1}\right)^{\frac{1}{2}k'}\left(\frac{n+k'-3}{2}\right)! \bigg/ \left(\frac{n-3}{2}\right)!, \qquad (4.2)$$

since, as usual, $(n-1)s^2 = \sum(x_i - \bar{x})^2$. It will be useful to expand $\log_e Es^{k'}$ with k' = ck using
Stirling's formula in (4.2):

$$\log_e Es^{k'} = \frac{k'}{2}\log\frac{2}{n-1} + \log\left(\frac{n+k'-3}{2}\right)! - \log\left(\frac{n-3}{2}\right)!$$
$$\simeq \frac{k'^2-2k'}{4(n-1)} - \frac{k'(k'-1)(k'-2)}{12(n-1)^2} + \frac{k'^2(k'-2)^2}{24(n-1)^3} - \frac{k'(k'-1)(k'-2)(3k'^2-6k'-4)}{120(n-1)^4}$$
$$+ \frac{k'^2(k'-2)^2(k'^2-2k'-2)}{60(n-1)^5} - \frac{k'(k'-1)(k'-2)(3k'^4-12k'^3+24k'+16)}{252(n-1)^6}$$
$$+ \frac{k'^2(k'-2)^2(3k'^4-12k'^3-4k'^2+32k'+32)}{336(n-1)^7}, \qquad (4.3)$$

which checks for k' = 1 to $(n-1)^{-7}$ with Geary (1935, p. 354). Take

$$v(c) = \frac{1}{n}\sum_{i=1}^{n}|z_i|^c, \qquad (4.4)$$

with $z_i = x_i - \bar{x}$.
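
The Stirling expansion (4.3) can be checked against the exact logarithm of the gamma-function ratio. The sketch below assumes a unit-variance normal universe, so that (n - 1)s^2 is a chi-square variate with n - 1 degrees of freedom, and takes the seven terms of the series with alternating signs; the function names are illustrative.

```python
import math

def log_moment_exact(kp, n):
    """log E(s^k') for a unit-variance normal universe, via log-gamma."""
    m = n - 1
    return (kp / 2) * math.log(2 / m) + math.lgamma((m + kp) / 2) - math.lgamma(m / 2)

def log_moment_series(kp, n):
    """Seven-term expansion in inverse powers of (n - 1)."""
    m, k = n - 1, kp
    coeffs = [
        (k * k - 2 * k) / 4,
        -k * (k - 1) * (k - 2) / 12,
        k**2 * (k - 2) ** 2 / 24,
        -k * (k - 1) * (k - 2) * (3 * k**2 - 6 * k - 4) / 120,
        k**2 * (k - 2) ** 2 * (k**2 - 2 * k - 2) / 60,
        -k * (k - 1) * (k - 2) * (3 * k**4 - 12 * k**3 + 24 * k + 16) / 252,
        k**2 * (k - 2) ** 2 * (3 * k**4 - 12 * k**3 - 4 * k**2 + 32 * k + 32) / 336,
    ]
    return sum(c / m ** (j + 1) for j, c in enumerate(coeffs))

for kp in (1.0, 2.4, 4.0):
    print(kp, log_moment_exact(kp, 25), log_moment_series(kp, 25))
```

For n = 25 the two columns agree to many decimal places, and for k' = 2 both vanish, as they must since E(s^2) = 1.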





The moments of v(c) will be found exactly as in the case of c = 1 (Geary, 1936) from the
single or joint normal frequency distributions of $(z_1, z_2, \ldots)$. We find






1 «*-»?(Sz!)i r ? .• -^-ir (.-««« {^)ij 


(4-6) 


(4-6) 

For the third moment we write

$$M'_3\{v(c)\} = E\{v(c)\}^3 = \frac{1}{n^2}E|z_1|^{3c} + \frac{3(n-1)}{n^2}E|z_1|^{2c}|z_2|^c + \frac{(n-1)(n-2)}{n^2}E|z_1|^c|z_2|^c|z_3|^c$$
$$= A_1 + A_2 + A_3, \qquad (4.7)$$

denoting the three terms on the right by $A_1$, $A_2$, $A_3$ respectively. Then


i<4+3r) 


J (2c+ \)(c+\) (2c + 3)(2c+l)(c + 3)(c+l) \ 

\ l+ 2!(n — i) 2 + 4!(» — l) 4 + 

[fiW 




2«\j 
77 


(n - 3) W** (n - 2)-« 3c +« (w - 1 ) n~* 


3 (c + 1 ) 2 
2 (n — 2)* 


(c + 1) 3 (c + l) a (c + 3)(7c + 9) (c + 3) 2 (c + 1 ) 3 
(n-2) 3 + 8(n — 2) 4 * 2(n-2) 6 

(c + 3) 2 (c+l)M61c 2 + 310c + 265) ( 

+ 240(» — 2 j 8 + "'f* 

Similarly, for the fourth moment,

$$M'_4\{v(c)\} = E\{v(c)\}^4 = \frac{1}{n^3}E|z_1|^{4c} + \frac{4(n-1)}{n^3}E|z_1|^{3c}|z_2|^c + \frac{3(n-1)}{n^3}E|z_1|^{2c}|z_2|^{2c}$$
$$+ \frac{6(n-1)(n-2)}{n^3}E|z_1|^{2c}|z_2|^c|z_3|^c + \frac{(n-1)(n-2)(n-3)}{n^3}E|z_1|^c|z_2|^c|z_3|^c|z_4|^c$$
$$= C_1 + C_2 + C_3 + C_4 + C_5,$$

with
( 


x 1 + 


(3c + l)(c + 1) (3c + 3)(3c + l)(c + 3)(c + 1) 


(4-8) 


2!(n— l) 2 


4!(n — l) 4 







Q 2&Ml 

a = — - v— (n - 3)^ (* - 2)~W*+» (i 

7T» 


( (c + l)(5c + 3) (c + 1 ) 2 (2c + 1 ) (c+l)(57c 3 + 227c 2 + 255c + 81) 

\ 1+ 2(n — 2) 2 (» — 2) 3 + 24(w-2) 4 


(2c + l)(c + l) 2 (c + 3)(5c + 9) 

6 (» — 2) 8 




6' = ^ (» - 4)« 4c + 5 > (n - 3) -20 ' 1 (n - 2) (n - 1 ) 

7T* 


( 3(c+l) 2 4(c+l) 3 (c+l) 2 (7c 2 + 21c+15) 4(c + 3)(c + l)*(2c + 3) 

+ ( 1 + '(n-3) 2 (n-3) 3+ (/< - 3) 4 (n-3) 8 


(c + 3) (c+ l) 2 (122c 3 + 671c 2 + 1070c + 525) 


15(w — 3) 6 


Formulae (4.5), (4.6), (4.7) and (4.8) were checked from the corresponding formulae for
c = 1 given in the author's 1936 paper.
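
The numerical factors that multiply the expectations in the third and fourth moments arise from counting how many index sets (i, j, k) or (i, j, k, l) share a given pattern of repeated indices, the z's being exchangeable. A brute-force enumeration, with an illustrative function name, confirms the counts n, 3n(n-1), n(n-1)(n-2) for the third moment and n, 4n(n-1), 3n(n-1), 6n(n-1)(n-2), n(n-1)(n-2)(n-3) for the fourth, each to be divided by n^3 or n^4.

```python
from collections import Counter
from itertools import product

def index_pattern_counts(n, order):
    """Count index tuples from {0..n-1}^order by their pattern of repeats."""
    counts = Counter()
    for idx in product(range(n), repeat=order):
        # Pattern = sorted multiplicities, e.g. (2, 1, 1) for one pair and two singles.
        pattern = tuple(sorted(Counter(idx).values(), reverse=True))
        counts[pattern] += 1
    return counts

n = 5
print(index_pattern_counts(n, 3))
print(index_pattern_counts(n, 4))
```

Dividing each count by n^4 gives the coefficients of the five terms of the fourth moment, the all-distinct pattern supplying the principal term.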

From the following section it will be apparent that for indefinitely large samples the most
sensitive test of kurtosis of the field a(c) is found for c = 4. At the same time it is shown that
there is really not much difference in efficiency for values of c in the range $5 \ge c \ge 2$; moreover,
the results in § 6 (in which the efficiency of the tests for c = 4 and c = 1 are compared from
the power function viewpoint) suggest that, for samples of moderate size, the superiority,
if any at all, of a test using a(4) = $b_2$ over other tests in the series may be even less marked.
The disadvantage of a(4) is that its frequency is not known for samples of all sizes; and if we
could estimate, with any degree of confidence, the probability points of a(c) for any value or
values of c > 2 for medium-size samples we might, for practical purposes, dispense with a(4)
altogether, since, while we now know one way of solving the problem of determining the
exact, or almost exact, frequency distribution of a(4), it must be admitted that the method
is extremely tedious. (From the theoretical point of view, however, the a(4) problem must
be solved since it remains a challenge to the mathematical skill of statisticians!) It will
accordingly be of interest to study the order of magnitude of the semi-invariants of a(c)
for c near 2.

Consider the case, for example, of c = 2.4, not by any means, it is important to
observe, the lowest value which would be used for tabulating. In Table 4 the first three
moments are given for n = 25. The L's represent, of course, the semi-invariants. The values
of the functions for a₁(c) (given by (3.2)) for n = 24 (i.e. the appropriate number of degrees
of freedom for comparison with a(c)) are also given. These show that the moments of a₁(c)
are very close to those of a(c), which suggests that, when n is not less than, say, 20, the values
of B₁, B₂ and corresponding functions of higher orders, if required, for a₁(c) could be used
for the determination of the probability points of a(c). This is important from the
computational point of view because the algebraic expressions for the normal moments of a₁(c)
are exceedingly simple whereas it must be conceded that (4.8) offers a grim prospect for the
computer; furthermore, the principal term C₅ is rather slowly convergent unless n > 50 or so,




whereas exact values for all values of n can readily be found for the moments of a₁(c) for normal
samples.


Table 4. Normal moments, etc., of a(c) and a₁(c) for c = 2.4

                        a(2.4)       a₁(2.4)
n                       25           24
M′₁ = L₁                1.166252     1.1662524891
M′₂                     1.362004     1.362091180
M′₃                     1.592841     1.593151615
M₂ = L₂                 0.001860     0.001946318
M₃ = L₃                 0.000063     0.000069583
√B₁ = L₃/L₂^(3/2)       0.80         0.8104


As with (4.1) for a(c), the moment (from the origin) of any order of a₁(c) is the quotient
of the moments of the same order for numerator and denominator, assuming that the
universal mean is zero and the variance unity. Since the different members xᵢ of the sample
are independent (the difficulty with a(c) is that the (xᵢ − x̄) are not independent), for the
moments of the numerator of (3.2) we require only

$$E|x|^{k} = \frac{2}{\sqrt{(2\pi)}}\int_0^\infty dx\,x^{k}e^{-\frac12 x^{2}} = \frac{2^{k/2}}{\sqrt{\pi}}\,\Gamma\!\left(\frac{k+1}{2}\right), \tag{4.9}$$

and for the denominator

$$E\left(\frac{1}{n}\sum_{i=1}^{n}x_i^{2}\right)^{k'} = \left(\frac{2}{n}\right)^{k'}\frac{\Gamma(\frac12 n + k')}{\Gamma(\frac12 n)}. \tag{4.10}$$
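As a numerical illustration of (4.9) and (4.10), the following Python sketch evaluates the first three moments of a₁(c) for normal samples as quotients of numerator and denominator moments. It assumes that a₁(c) is the ratio of (1/n)Σ|xᵢ|^c to {(1/n)Σxᵢ²}^(c/2), with universal mean zero and variance unity; the function names are illustrative and are not taken from the paper.

```python
from math import gamma, pi, sqrt


def abs_moment(k):
    """E|x|^k for a unit normal variate, as in (4.9)."""
    return 2.0 ** (k / 2) * gamma((k + 1) / 2) / sqrt(pi)


def mean_square_moment(n, p):
    """E[(sum(x_i^2)/n)^p] for n independent unit normal variates, as in (4.10)."""
    return (2.0 / n) ** p * gamma(n / 2 + p) / gamma(n / 2)


def numerator_moment(c, n, r):
    """E[(sum(|x_i|^c)/n)^r] for r = 1, 2, 3, using independence of the x_i."""
    m1, m2, m3 = abs_moment(c), abs_moment(2 * c), abs_moment(3 * c)
    if r == 1:
        total = n * m1
    elif r == 2:
        total = n * m2 + n * (n - 1) * m1 ** 2
    elif r == 3:
        total = (n * m3 + 3 * n * (n - 1) * m2 * m1
                 + n * (n - 1) * (n - 2) * m1 ** 3)
    else:
        raise ValueError("only r = 1, 2, 3 are implemented in this sketch")
    return total / n ** r


def a1_raw_moment(c, n, r):
    """r-th moment from the origin of a1(c): quotient of the two moments."""
    return numerator_moment(c, n, r) / mean_square_moment(n, r * c / 2)


def a1_semi_invariants(c, n):
    """Return (L1, L2, L3) of a1(c) for normal samples of n."""
    m1, m2, m3 = (a1_raw_moment(c, n, r) for r in (1, 2, 3))
    return m1, m2 - m1 ** 2, m3 - 3 * m2 * m1 + 2 * m1 ** 3


if __name__ == "__main__":
    for c in (2.4, 4):
        l1, l2, l3 = a1_semi_invariants(c, 24)
        print(c, l1, l2, l3, l3 / l2 ** 1.5)
```

Calling `a1_semi_invariants(2.4, 24)` gives figures that can be set beside the a₁(2.4) column of Table 4; only the first three moments are covered, since the fourth needs one further term in the multinomial expansion.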


The case of c = 4 is particularly simple. The first four semi-invariants are as follows:

$$\begin{aligned}
L_1 &= M'_1 = \frac{3n}{n+2},\\
L_2 &= M_2 = \frac{24n^{2}(n-1)}{(n+2)^{2}(n+4)(n+6)},\\
L_3 &= M_3 = \frac{1728\,n^{3}(n-1)(n-2)}{(n+2)^{3}(n+4)(n+6)(n+8)(n+10)},\\
L_4 &= \frac{10{,}368\,n^{4}(n-1)(30n^{4}+168n^{3}-608n^{2}-2672n+3712)}{(n+2)^{4}(n+4)^{2}(n+6)^{2}(n+8)(n+10)(n+12)(n+14)}.
\end{aligned} \tag{4.11}$$
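The closed forms for c = 4 lend themselves to exact evaluation in rational arithmetic. The sketch below is only an illustration: it assumes the reading L₁ = 3n/(n + 2) and the quartic 30n⁴ + 168n³ − 608n² − 2672n + 3712 in L₄ (coefficients chosen to agree with the tabulated c = 4 values for n = 24 and n = 50), and the function names are hypothetical.

```python
from fractions import Fraction


def a1_4_semi_invariants(n):
    """Exact L1..L4 of a1(4) for normal samples of n, following (4.11)."""
    n = Fraction(n)
    l1 = 3 * n / (n + 2)
    l2 = 24 * n ** 2 * (n - 1) / ((n + 2) ** 2 * (n + 4) * (n + 6))
    l3 = (1728 * n ** 3 * (n - 1) * (n - 2)
          / ((n + 2) ** 3 * (n + 4) * (n + 6) * (n + 8) * (n + 10)))
    quartic = 30 * n ** 4 + 168 * n ** 3 - 608 * n ** 2 - 2672 * n + 3712
    l4 = (10368 * n ** 4 * (n - 1) * quartic
          / ((n + 2) ** 4 * (n + 4) ** 2 * (n + 6) ** 2
             * (n + 8) * (n + 10) * (n + 12) * (n + 14)))
    return l1, l2, l3, l4


def shape_coefficients(n):
    """Return (sqrt(B1), B2 - 3) = (L3 / L2^(3/2), L4 / L2^2) as floats."""
    _, l2, l3, l4 = (float(v) for v in a1_4_semi_invariants(n))
    return l3 / l2 ** 1.5, l4 / l2 ** 2


if __name__ == "__main__":
    for n in (24, 50):
        print(n, [float(v) for v in a1_4_semi_invariants(n)],
              shape_coefficients(n))
```

Working with `Fraction` avoids any rounding until the final conversion, which matters little here but would matter for c near 2, where the variance is minute.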


Moments, etc., for a₁(c) for normal samples of 24 and 50 are contrasted for c = 2.4 and
c = 4 in Table 5. The contrast between the values of √B₁ and (B₂ − 3) respectively for
a₁(2.4) and a₁(4) is striking in the extreme. Even for n = 24, √B₁[a₁(2.4)] and B₂[a₁(2.4)]
are approaching the values at which a Gram-Charlier approximation to the frequency
distribution may be reasonably convergent. Furthermore, the decline in the values of the
B's from n = 24 to n = 50 is marked for a₁(2.4), while the decline in the B[a₁(4)] is very slow.

It is accordingly suggested that a table of probability points (perhaps 0.001, 0.01, 0.025,
0.05 and 0.10) of a(c), for c equal to, say, 2.2, be prepared for n ≥ 25 on the assumption that
Gram-Charlier applies throughout. For this purpose the values of the mean and variance
for n at intervals of, say, 10 should be computed from formulae (4.5) and (4.6); the B₁ and
(B₂ − 3) should, however, be computed as for a₁(c). For lower sample sizes it might be well




to use terms to order n⁻² which would render necessary the use of the fifth and sixth
semi-invariants of a₁(c). The formulae given by E. A. Cornish & R. A. Fisher (1937) (assuming
Gram-Charlier) could be used to find the probability points. On account of the minuteness
of the variance L₂ for c near 2 it will be necessary to work to many places of decimals, at
least 10. As stated at the outset, the test of kurtosis a(2.2) will be only slightly less efficient
than a(4) and it may be slightly more efficient than a(1), the probability points of which are
known approximately for samples of all sizes. In any case the a(2.2) table would be a useful
adjunct to that of a(1).


Table 5. Normal moments, etc., of a₁(c) for c = 2.4 and c = 4

                            n = 24                       n = 50
                      c = 2.4        c = 4         c = 2.4        c = 4
M′₁ = L₁              1.1662524891   2.769231      1.1721603127   2.884615
M₂ = L₂               0.001946318    0.559932      0.001058402    0.359550
M₃ = L₃               0.000069583    0.752488      0.000022251    0.343337
L₄ = M₄ − 3M₂²        0.000004921    1.955999      0.000000919    0.711375
√B₁ = L₃/L₂^(3/2)     0.8104         1.7960        0.6462         1.5925
B₂ − 3 = L₄/L₂²       1.30           6.24          0.82           5.50


In an earlier paper (1935) the writer suggested that the correlation between b₂ and a(1)
for normal samples gave some indication of the relative efficiency of these two tests of
normality. In this order of ideas it seems desirable to compute the approximate value of
the correlation coefficient between a(c) and a(c′), where c and c′ are any two positive
constants. In the first instance the universe from which the sample of n was drawn was not
necessarily normal. Since in the present application we will be concerned only with large
samples we assume the universal mean known (and accordingly it may be taken as zero,
i.e. μ₁ = 0), so that, instead of a(c) we use, in reality, a₁(c) given by (3.2). In the remainder
of this section we write a for a₁(c) and a′ for a₁(c′):


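$$a = \frac{\frac{1}{n}\sum_i |x_i|^{c}}{\left(\frac{1}{n}\sum_i x_i^{2}\right)^{c/2}}, \tag{4.12}$$

$$a' = \frac{\frac{1}{n}\sum_i |x_i|^{c'}}{\left(\frac{1}{n}\sum_i x_i^{2}\right)^{c'/2}}. \tag{4.13}$$

Set

$$\begin{aligned}
y_i &= (|x_i|^{c} - \mu_{|c|})/\mu_{|c|}, \qquad y'_i = (|x_i|^{c'} - \mu_{|c'|})/\mu_{|c'|}, \qquad z_i = (x_i^{2} - \mu_2)/\mu_2,\\
\alpha &= \mu_{|c|}/\mu_2^{c/2}, \qquad \alpha' = \mu_{|c'|}/\mu_2^{c'/2},\\
C &= \frac{c+c'}{2}, \qquad C_k = \frac{C(C+1)(C+2)\ldots(C+k-1)}{k!}.
\end{aligned} \tag{4.14}$$

Then

$$\frac{aa'}{\alpha\alpha'} = \left(1 + \frac{1}{n}\sum_i y_i\right)\left(1 + \frac{1}{n}\sum_i y'_i\right)\left(1 + \frac{1}{n}\sum_i z_i\right)^{-C}. \tag{4.15}$$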





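The mean value of aa′/αα′ was found approximately (i.e. to terms in n⁻³) by formally
expanding the last factor in (4.15), multiplying by the first two factors, and setting down
the mean value term by term, so that

$$\begin{aligned}
M'_{cc'}/\alpha\alpha' = Eaa'/\alpha\alpha' ={}& \Big[1 + \frac{C_2}{n^{2}}\,nEz^{2} - \frac{C_3}{n^{3}}\,nEz^{3} + \frac{C_4}{n^{4}}\{nEz^{4} + 3n(n-1)E^{2}z^{2}\}\\
&\quad - \frac{C_5}{n^{5}}\{10n(n-1)Ez^{3}Ez^{2}\} + \frac{C_6}{n^{6}}\{15n(n-1)(n-2)E^{3}z^{2}\}\Big]\\
&+ \Big[-\frac{C_1}{n^{2}}(nEyz + nEy'z) + \frac{C_2}{n^{3}}(nEyz^{2} + nEy'z^{2})\\
&\quad - \frac{C_3}{n^{4}}\{n(Eyz^{3} + Ey'z^{3}) + 3n(n-1)Ez^{2}(Eyz + Ey'z)\}\\
&\quad + \frac{C_4}{n^{5}}\{4n(n-1)Ez^{3}(Eyz + Ey'z) + 6n(n-1)Ez^{2}(Eyz^{2} + Ey'z^{2})\}\\
&\quad - \frac{15C_5}{n^{6}}\,n(n-1)(n-2)E^{2}z^{2}(Eyz + Ey'z)\Big]\\
&+ \Big[\frac{1}{n^{2}}\,nEyy' - \frac{C_1}{n^{3}}\,nEyy'z + \frac{C_2}{n^{4}}\{nEyy'z^{2} + 2n(n-1)EyzEy'z + n(n-1)Eyy'Ez^{2}\}\\
&\quad - \frac{C_3}{n^{5}}\{n(n-1)Eyy'Ez^{3} + 3n(n-1)(Eyz^{2}Ey'z + Ey'z^{2}Eyz + Eyy'zEz^{2})\}\\
&\quad + \frac{C_4}{n^{6}}\{3n(n-1)(n-2)Eyy'E^{2}z^{2} + 12n(n-1)(n-2)EyzEy'zEz^{2}\}\Big].
\end{aligned} \tag{4.16}$$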

The E's in (4.16) are readily calculable from (4.14), e.g.

$$Eyy' = E(|x|^{c} - \mu_{|c|})(|x|^{c'} - \mu_{|c'|})/\mu_{|c|}\mu_{|c'|} = (\mu_{|c+c'|}/\mu_{|c|}\mu_{|c'|}) - 1.$$

It has been verified that when c is substituted for c′ in (4.13) the formula agrees with that
for the second moment of a₁(c) given in § 6.

The coefficient of correlation is, of course,

$$R_{cc'} = \frac{M_{cc'}}{\sqrt{(M_{cc}M_{c'c'})}}, \tag{4.17}$$

with $M_{cc'} = M'_{cc'} - Ea\,Ea'$.

Formulae for the first and second moments, to the approximation required, for the
computation of (4.17) are given in § 6.

As an application, the following are the values of the variances and the covariance for the
tests of normality a(1) and a(4) (= b₂), i.e. in which c and c′ have respectively the values 1 and 4,
and where the universe belongs to the Pearson system with λ₂ = 1, λ₃ = 0 and λ₄ = ½:

$$\begin{aligned}
\frac{M_{cc}}{\mu_{|c|}^{2}} &= \frac{0.09313705}{n} - \frac{0.262961}{n^{2}} - \frac{0.196477}{n^{3}},\\
\frac{M_{c'c'}}{\mu_{|c'|}^{2}} &= \frac{4.4286}{n} - \frac{92.25}{n^{2}} + \frac{831.2}{n^{3}},\\
\frac{M_{cc'}}{\mu_{|c|}\mu_{|c'|}} &= -\frac{0.491}{n} + \frac{4.87}{n^{2}} - \frac{281.5}{n^{3}}.
\end{aligned} \tag{4.18}$$





From (4.17) and (4.18), R(n = 100) = −0.826 and R(n → ∞) = −0.764. It is of great
interest to find that, though the universe is markedly non-normal, the correlation for
indefinitely large samples is practically identical with the normal theory value of −0.767
(Geary, 1935), another indication, no doubt, that normal theory inferences can usually be
applied with confidence when the parent universe is not markedly unsymmetrical.

When samples are indefinitely large we find, from (4.16) and (4.17), taking μ₂ = 1,

$$R = \frac{\mu_{|c+c'|} - \tfrac12\left(c\,\mu_{|c|}\mu_{|c'+2|} + c'\mu_{|c'|}\mu_{|c+2|}\right) + \tfrac14\,\mu_{|c|}\mu_{|c'|}\{cc'\mu_4 - (c-2)(c'-2)\}}{\sqrt{(M_{cc}M_{c'c'})}}, \tag{4.19}$$

where, of course, the values to be taken here for M_cc and M_c′c′ are found by substituting
respectively c for c′ and c′ for c in the numerator. When, in addition, the parent universe is
normal, we find

$$R = \frac{\mu_{|c+c'|} - (1 + \tfrac12 cc')\mu_{|c|}\mu_{|c'|}}{\sqrt{\left[\{\mu_{|2c|} - (1 + \tfrac12 c^{2})\mu_{|c|}^{2}\}\{\mu_{|2c'|} - (1 + \tfrac12 c'^{2})\mu_{|c'|}^{2}\}\right]}}, \tag{4.20}$$

which reduces to −1/√{12(π − 3)} for c = 1, c′ = 4, as it should (Geary, 1935). The following
section will accord b₂ (i.e. a(4)) a decided primacy amongst tests of normality when the
samples are indefinitely large. It may, therefore, be of interest to give the values of the
correlation coefficients (for indefinitely large normal samples) between b₂ and a(c) for
selected values of c (Table 6). The table suggests, in the high coefficients of correlation,
except for c very near 0 or 2, that all the a(c) should be reliable tests of kurtosis, with no great
difference between their efficiencies. The efficiency of any two tests would be identical, in
the conditions stated, if the coefficient of correlation between them was ± 1 because then,
of course, they would be functionally, and not stochastically, related.


Table 6. Correlation between b₂ and a(c) for indefinitely large normal samples

Value of c    Value of R        Value of c    Value of R
0             0                 3             0.980
1             −0.767            4             1
2             0                 5             0.983
2.2           0.887             6             0.939
2.5           0.952             ∞             0
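The limiting normal-theory correlation can be checked numerically. The Python sketch below is an illustration under stated assumptions, not a transcription of the printed formula: it uses a first-order (delta-method) expansion of a₁(c) about the universal moments, which for the unit normal gives a covariance proportional to μ_|c+c′| − (1 + ½cc′)μ_|c|μ_|c′|. The function names are illustrative.

```python
from math import gamma, pi, sqrt


def abs_moment(k):
    """E|x|^k for a unit normal variate."""
    return 2.0 ** (k / 2) * gamma((k + 1) / 2) / sqrt(pi)


def covariance_term(c, d):
    """Leading (order 1/n) covariance factor of a1(c) and a1(d), normal parent.

    Assumed form: mu_{c+d} - (1 + c*d/2) * mu_c * mu_d.
    """
    return abs_moment(c + d) - (1.0 + c * d / 2.0) * abs_moment(c) * abs_moment(d)


def limiting_correlation(c, d):
    """Correlation of a1(c) and a1(d) for indefinitely large normal samples."""
    return covariance_term(c, d) / sqrt(
        covariance_term(c, c) * covariance_term(d, d))


if __name__ == "__main__":
    for c in (1, 3, 5, 6):
        print(c, round(limiting_correlation(c, 4), 3))
```

For c = 1 the value coincides with −1/√{12(π − 3)}. The function is not meaningful at c = 0 or c = 2 exactly, where a(c) is identically 1 and the variance term vanishes.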


5. The most efficient tests for indefinitely large samples 

In this section we consider the efficiency of tests of kurtosis and asymmetry from the
viewpoint of indefinitely large samples.

By definition a test will be regarded as valid, in relation to a field of continuous alternative
universes including the normal, if its value for infinite samples drawn at random from the
normal universe is different from its value for infinite samples from other universes of the
field. As the sample number increases the test will become increasingly discriminatory of
the normal as distinct from other universes of the field. This increased sensitivity might be
given mathematical expression in some such terms as the following: given a probability α
(say 0.01), the normal universe W₀ of the field and any other distribution W₁ of the field,





a number n₁ can be found so that for n ≥ n₁ the mean value of the test function for samples
of n from W₁ will lie at or beyond the α probability point of the test function for samples of n
from W₀; the smaller n₁ the more sensitive the test.

We consider, then, the infinite field of alternative tests of kurtosis represented by (3.1)
when c assumes all positive values, and the infinite field of alternative universes represented
by the Gram-Charlier frequency


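$$f(x) = \frac{1}{\sqrt{(2\pi)}}\exp\left\{\sum_{i=3}^{\infty}\frac{\lambda_i}{i!}\left(-\frac{d}{dx}\right)^{i}\right\}e^{-\frac12 x^{2}}. \tag{5.1}$$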


The universal variance is assumed to be unity, without loss of generality. The normal
universe is a member of the field: it is found when all the λᵢ (i > 2) are zero. We assume that
the conditions of § 2 are satisfied so that for indefinitely large samples the frequency
distribution of a(c) for all parent universes is normal. Obviously the efficiency of any particular
test (i.e. a(c) for a particular value of c) in regard to the normal and a particular non-normal
alternative (i.e. a Gram-Charlier frequency with particular values of the λᵢ) will be adjudged
by considering the ratio of

(i) the difference between the universal mean values of a(c) for the normal and the
particular non-normal parent universes; to

(ii) the standard deviation of a(c) for indefinitely large normal samples.

The most efficient test will be a(c) for c a theoretically ascertainable function of the given
λᵢ which makes the ratio a maximum.

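For indefinitely large samples the mean value φ of a(c) when the parent universe is given
by (5.1) is

$$\phi = \frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{\infty}dx\,|x|^{c}\exp\left\{\sum_{i=3}^{\infty}\frac{\lambda_i}{i!}\left(-\frac{d}{dx}\right)^{i}\right\}e^{-\frac12 x^{2}}. \tag{5.2}$$

Obviously

$$\int_{-\infty}^{\infty}dx\,|x|^{c}\left(\frac{d}{dx}\right)^{2m+1}e^{-\frac12 x^{2}} = 0.$$

Also, when m ≥ 1,

$$\frac{1}{\sqrt{(2\pi)}}\int_{-\infty}^{\infty}dx\,|x|^{c}\left(\frac{d}{dx}\right)^{2m}e^{-\frac12 x^{2}} = \frac{\left(\frac{c-1}{2}\right)!\,2^{c/2}}{\sqrt{\pi}}\,c(c-2)(c-4)\ldots(c-2m+2), \tag{5.3}$$

a result readily inferable from the obvious fact that the left side vanishes for c = 0, 2, ...,
2m − 2. Accordingly

$$\phi = \frac{\left(\frac{c-1}{2}\right)!\,2^{c/2}}{\sqrt{\pi}}\left\{1 + \frac{\lambda_4}{4!}\,c(c-2) + \cdots\right\}. \tag{5.4}$$

The normal value is given by the first term.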

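From (4.3), (4.5) and (4.6) it is evident that the value of the standard deviation, for large
normal samples (retaining only the term in n⁻¹ in the variance), is

$$\sigma = \frac{\left(\frac{c-1}{2}\right)!\,2^{c/2}}{\sqrt{(\pi n)}}\left\{\frac{\sqrt{\pi}\,(c-\frac12)!}{\left[\left(\frac{c-1}{2}\right)!\right]^{2}} - \frac{c^{2}+2}{2}\right\}^{1/2}. \tag{5.5}$$

The principal term in the deviation δ = φ − φ⁰ (where φ⁰ is the normal value), from (5.4), is

$$\delta = \frac{\left(\frac{c-1}{2}\right)!\,2^{c/2}}{\sqrt{\pi}}\cdot\frac{\lambda_4\,c(c-2)}{24}. \tag{5.6}$$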




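To a constant factor, the ratio δ/σ is given by the first discriminant

$$p(c) = c(c-2)\left\{\frac{\sqrt{\pi}\,(c-\frac12)!}{\left[\left(\frac{c-1}{2}\right)!\right]^{2}} - \frac{c^{2}+2}{2}\right\}^{-1/2}.$$

It will now be shown that

$$\frac{dp(c)}{dc} = 0 \quad\text{for } c = 4. \tag{5.7}$$

The discriminant may be written in the form

$$p(c) = c(c-2)\left\{\frac{2^{c}I_{2c}}{I_c} - \frac{c^{2}+2}{2}\right\}^{-1/2}, \tag{5.8}$$

where

$$I_c = \int_0^{\frac12\pi}\cos^{c}\theta\,d\theta, \tag{5.9}$$

and

$$\frac{p'(c)}{p(c)} = \frac{1}{c} + \frac{1}{c-2} - \frac12\,\frac{\dfrac{2^{c}I_{2c}}{I_c}\left(\log 2 + \dfrac{I'_{2c}}{I_{2c}} - \dfrac{I'_c}{I_c}\right) - c}{\dfrac{2^{c}I_{2c}}{I_c} - \dfrac{c^{2}+2}{2}}. \tag{5.10}$$

From (5.9)

$$J_c = I'_c = \int_0^{\frac12\pi}d\theta\,\cos^{c}\theta\,\log\cos\theta. \tag{5.11}$$

From a fairly well-known property

$$J_0 = \int_0^{\frac12\pi}d\theta\,\log\cos\theta = -\tfrac12\pi\log 2. \tag{5.12}$$

In (5.10) we shall be concerned only with even positive integer values of c. We have at once

$$I_0 = \frac{\pi}{2},\quad I_2 = \frac{\pi}{4},\quad I_4 = \frac{3\pi}{16},\quad I_6 = \frac{5\pi}{32},\quad I_8 = \frac{35\pi}{256}. \tag{5.13}$$

From (5.11)

$$J_{2c} = \int_0^{\frac12\pi}d\theta\,\cos^{2c}\theta\,\log\cos\theta = \int_0^{\frac12\pi}d(\sin\theta)\,\cos^{2c-1}\theta\,\log\cos\theta,$$

which, by partial integration,

$$\begin{aligned}
&= \int_0^{\frac12\pi}d\theta\,\sin\theta\left((2c-1)\sin\theta\cos^{2c-2}\theta\,\log\cos\theta + \frac{\cos^{2c-1}\theta\,\sin\theta}{\cos\theta}\right)\\
&= (2c-1)(J_{2c-2} - J_{2c}) + I_{2c-2} - I_{2c}.
\end{aligned}$$

Hence

$$2cJ_{2c} = (2c-1)J_{2c-2} + I_{2c-2} - I_{2c}. \tag{5.14}$$

From (5.12), (5.13) and (5.14),

$$\begin{aligned}
J_0 &= -\tfrac12\pi\log 2, & J_6 &= (-60\pi\log 2 + 37\pi)/384,\\
J_2 &= (-2\pi\log 2 + \pi)/8, & J_8 &= (-840\pi\log 2 + 533\pi)/6144.\\
J_4 &= (-12\pi\log 2 + 7\pi)/64,
\end{aligned} \tag{5.15}$$

Noting that I′₂c = 2J₂c and substituting in the right side of (5.10) the values of I and J given
by (5.13) and (5.15), we find p′(4) = 0. Table 7 gives the values of the discriminant for certain
values of c.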
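The maximum at c = 4 can also be seen numerically. The following Python sketch is an illustration under a stated assumption: it takes the discriminant to be c(c − 2) divided by the square root of μ_|2c|/μ_|c|² − 1 − ½c², where μ_|k| is the k-th absolute moment of the unit normal (a form obtained from a first-order expansion and normalized so that p(4) = √24). Function names are illustrative.

```python
from math import exp, lgamma, log, pi, sqrt


def moment_ratio(c):
    """mu_{|2c|} / mu_{|c|}^2 for the unit normal, via log-gamma for stability."""
    return exp(0.5 * log(pi) + lgamma(c + 0.5) - 2.0 * lgamma((c + 1.0) / 2.0))


def discriminant(c):
    """First discriminant p(c); undefined at c = 0 and c = 2."""
    return c * (c - 2.0) / sqrt(moment_ratio(c) - 1.0 - c * c / 2.0)


def slope(c, h=1e-3):
    """Central-difference estimate of p'(c)."""
    return (discriminant(c + h) - discriminant(c - h)) / (2.0 * h)


if __name__ == "__main__":
    for c in (0.5, 1.0, 1.5, 3.0, 4.0, 5.0, 6.0, 7.0):
        print(c, round(discriminant(c), 3))
    print("slope at c = 4:", slope(4.0))
```

The finite-difference slope is only a check on the turning point; the exact statement p′(4) = 0 rests on the recurrence for the J's given in the text.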

The discriminant accordingly assumes a maximum value for c = 4, a result so remarkable
that one might be inclined to suspect that it is a consequence of the form which was assumed
for the alternative to the normal curve, a form which, in placing such emphasis on λ₄,




high-lights, so to speak, b₂ (= λ₄ + 3 when λ₂ = 1 for indefinitely large samples) as a test of
normality. From the algebraic point of view this is anything but obvious: the property
emerges from quite a complicated piece of algebra. It may also be emphasized that the field
of alternatives (5.1) is not arbitrary; it is a general form of frequency distribution when all
the λᵢ are finite. Admittedly the discriminant takes account only of the term in λ₄ in the
expansion; but this is certainly the most significant term for a wide class of frequency
distributions, namely, those of homogeneous symmetrical functions of samples of n as n tends
towards infinity under very general conditions for the parent universe, provided that the
resulting frequency distribution can be assumed to have its third moment zero; for then the
only term in n⁻¹ in the frequency distribution of the function will be the term in λ₄. The
significance of the property demonstrated must not be overstressed since it is subject to
many qualifications, but it gives strong grounds for holding that, for very large samples,
b₂ is the most efficient test of normality of tests of type a(c) in relation to a very extended
class of alternative universes. At the same time Table 7 shows that there can be little
difference in efficiency in the field a(c) for c ranging from close to 2 to about 5. There is but
little doubt, on this showing, that b₂ is more sensitive than a(1), a conclusion suggested on
the basis of certain experimental results by E. S. Pearson (1935) and examined from the
viewpoint of power function theory in § 6.


Table 7

      0 < c < 2                        2 < c < ∞
c          Discriminant p(c)     c          Discriminant p(c)
+0         −2.334                2 + 0      4.460
0.1        −2.541                2.1        4.508
0.2        −2.725                2.5        4.666
0.5        −3.188                3.0        4.801
0.7        −3.441                3.9        4.898
1.0        −3.758                4.0        4.900
1.1        −3.851                4.1        4.898
1.5        −4.166                5.0        4.818
1.9        −4.405                6.0        4.602
2 − 0      −4.460                7.0        4.288
                                 8.0        3.908


Adverting to (5.4) in conjunction with (5.5), it might be asked if, on the analogy of the
maximal property just demonstrated for the first discriminant, the function

$$p_2(c) = c(c-2)(c-4)\left\{\frac{\sqrt{\pi}\,(c-\frac12)!}{\left[\left(\frac{c-1}{2}\right)!\right]^{2}} - \frac{c^{2}+2}{2}\right\}^{-1/2}$$

has a turning point at c = 6. The answer is in the negative. The value of p₂′(6)/p₂(6) is, in fact,
15/34. At the same time there must be a zero of p₂′(c) very near c = 6 since
p₂(5.9) = 8.79, p₂(6) = 9.20, p₂(6.1) = 8.56.

Analogous to the field of tests of kurtosis represented by (3.1) we may consider as a field
of tests of asymmetry:

$$g(c) = \frac{\frac{1}{n}\left\{-\sum{}'\,|x_i - \bar{x}|^{c} + \sum{}''\,(x_i - \bar{x})^{c}\right\}}{\left\{\frac{1}{n}\sum (x_i - \bar{x})^{2}\right\}^{c/2}}, \tag{5.16}$$



where Σ′ extends to the observations xᵢ less than the mean x̄ and Σ″ to the rest of the sample.
For c = 3 the test is, of course, √b₁. For normal samples

$$E\{g(c)\}^{k} = \frac{E\left[\frac{1}{n}\left\{-\sum{}'\,|x_i - \bar{x}|^{c} + \sum{}''\,(x_i - \bar{x})^{c}\right\}\right]^{k}}{E\left[\frac{1}{n}\sum (x_i - \bar{x})^{2}\right]^{kc/2}}, \tag{5.17}$$

the denominator of which is identical with the denominator of (4.1). Knowing the joint
distribution (for normal samples) of (x₁ − x̄), (x₂ − x̄), ... (Geary, 1936), there is no theoretical
difficulty in finding the mean values of the terms of the numerator for positive integer values
of k. Here we shall be concerned only with the first and second moments, i.e. those for (5.17)
for k = 1 and k = 2. We require the normal distribution of z₁ = x₁ − x̄ and the joint
distribution of z₁ and z₂ = x₂ − x̄. These are

$$\begin{aligned}
(z_1):\quad & \left(\frac{n}{2\pi(n-1)}\right)^{1/2}\exp\left\{-\frac{nz_1^{2}}{2(n-1)}\right\}dz_1,\\
(z_1, z_2):\quad & \frac{1}{2\pi}\left(\frac{n}{n-2}\right)^{1/2}\exp\left\{-\frac{(n-1)(z_1^{2}+z_2^{2}) + 2z_1z_2}{2(n-2)}\right\}dz_1dz_2 = f(z_1, z_2)\,dz_1dz_2.
\end{aligned} \tag{5.18}$$


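Clearly the odd normal moments of g(c) are zero. Then

$$E\left[\frac{1}{n}\left\{-\sum{}'\,|x_i - \bar{x}|^{c} + \sum{}''\,(x_i - \bar{x})^{c}\right\}\right]^{2} = \frac{n}{n^{2}}\,E|z_1|^{2c} + \frac{n(n-1)}{n^{2}}\,E_1(z_1, z_2), \tag{5.19}$$

where E₁(z₁, z₂) is the mean value of the two-dimensional terms. We then have

$$\begin{aligned}
E_1(z_1, z_2) ={}& \int_{-\infty}^{0}(-z_1)^{c}dz_1\int_{-\infty}^{0}(-z_2)^{c}dz_2\,f(z_1, z_2) - \int_{-\infty}^{0}dz_1(-z_1)^{c}\int_{0}^{\infty}dz_2\,z_2^{c}f(z_1, z_2)\\
&- \int_{0}^{\infty}dz_1\,z_1^{c}\int_{-\infty}^{0}dz_2(-z_2)^{c}f(z_1, z_2) + \int_{0}^{\infty}dz_1\,z_1^{c}\int_{0}^{\infty}dz_2\,z_2^{c}f(z_1, z_2)\\
={}& \int_{0}^{\infty}\!\!\int_{0}^{\infty}z_1^{c}z_2^{c}\,dz_1dz_2\,\{f(-z_1, -z_2) - f(-z_1, z_2) - f(z_1, -z_2) + f(z_1, z_2)\}\\
={}& -\left(\frac{n}{n-2}\right)^{1/2}\frac{(n-2)^{c+1}}{(n-1)^{c+2}}\,\frac{2^{c+1}}{\pi}\left[\left(\frac{c}{2}\right)!\right]^{2}\left\{1 + \frac{(c+2)^{2}}{3!\,(n-1)^{2}} + \frac{(c+2)^{2}(c+4)^{2}}{5!\,(n-1)^{4}} + \cdots\right\},
\end{aligned} \tag{5.20}$$

$$E|z_1|^{2c} = \left(\frac{n-1}{n}\right)^{c}\frac{(c-\frac12)!\,2^{c}}{\sqrt{\pi}}. \tag{5.21}$$

Also

$$E\left\{\frac{1}{n}\sum (x_i - \bar{x})^{2}\right\}^{c} = \left(\frac{2}{n}\right)^{c}\frac{\left(\frac{n-3}{2} + c\right)!}{\left(\frac{n-3}{2}\right)!}. \tag{5.22}$$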

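We now have all the expressions required for the variance of normal g(c). We require, for
what follows, only the term in n⁻¹ which is

$$\sigma^{2} = \frac{1}{n}\left\{\frac{(c-\frac12)!\,2^{c}}{\sqrt{\pi}} - \frac{\left[\left(\frac{c}{2}\right)!\right]^{2}2^{c+1}}{\pi}\right\}. \tag{5.23}$$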


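Consider now a field of alternative universes represented by

$$f(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12 x^{2}}\left\{1 + \frac{\lambda_3}{6}(x^{3} - 3x)\right\}, \tag{5.24}$$

the 'first approximation to the law of error' (for universal variance unity), obviously the
most appropriate asymmetrical field, for different values of the parameter λ₃, and
containing as a member of the field the normal distribution found for λ₃ = 0. For indefinitely
large samples from (5.24) the mean value of g(c) is

$$\phi = \frac{\lambda_3}{6}\,(c-1)\,\frac{\left(\frac{c}{2}\right)!\,2^{(c+1)/2}}{\sqrt{\pi}}. \tag{5.25}$$

From (5.23) and (5.25)

$$\frac{\phi}{\sigma} = \frac{\lambda_3\sqrt{n}}{6}\,\tau(c), \tag{5.26}$$

the skew discriminant τ(c) being given by

$$\tau(c) = (c-1)\left\{\frac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}} - 1\right\}^{-1/2}. \tag{5.27}$$

Log-differentiating,

$$\frac{\tau'(c)}{\tau(c)} = \frac{1}{c-1} - \frac12\,\frac{\dfrac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}}\left\{\log 2 - \dfrac{2}{2c+1} + \dfrac{2J_{2c+2}}{I_{2c+2}} - \dfrac{J_{c+1}}{I_{c+1}}\right\}}{\dfrac{2^{c+1}I_{2c+2}}{(2c+1)I_{c+1}} - 1}. \tag{5.28}$$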


Setting c = 3 and using (5.13) and (5.15), we find that τ′(3) = 0. Values of τ(c) for four
values of c are as follows:

    c     τ(c)          c     τ(c)
    2     2.370         4     2.389
    3     2.450         5     2.236

Accordingly, for indefinitely large samples the test of asymmetry g(c) is most efficient for
c = 3, when the test becomes the familiar √b₁. The margin in favour of this value of c, as
compared with others in the range 2 ≤ c ≤ 5, is, however, quite small.
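The four tabulated values of the skew discriminant can be reproduced with a few lines of Python. The sketch assumes the large-sample form τ(c) = (c − 1)μ_|c+1| / √(μ_|2c| − c²μ_|c−1|²), with μ_|k| the absolute moments of the unit normal; this form comes from a first-order expansion of g(c) and is an assumption, not a transcription of the printed formula. Function names are illustrative.

```python
from math import gamma, pi, sqrt


def abs_moment(k):
    """E|x|^k for a unit normal variate."""
    return 2.0 ** (k / 2) * gamma((k + 1) / 2) / sqrt(pi)


def skew_discriminant(c):
    """tau(c): standardized large-sample sensitivity of g(c) to lambda_3."""
    variance_term = abs_moment(2 * c) - (c * abs_moment(c - 1)) ** 2
    return (c - 1.0) * abs_moment(c + 1) / sqrt(variance_term)


def slope(c, h=1e-3):
    """Central-difference estimate of tau'(c)."""
    return (skew_discriminant(c + h) - skew_discriminant(c - h)) / (2.0 * h)


if __name__ == "__main__":
    for c in (2, 3, 4, 5):
        print(c, round(skew_discriminant(c), 3))
    print("slope at c = 3:", slope(3.0))
```

At c = 3 the expression reduces to 6/√(15 − 9) = √6, the familiar large-sample standardization of √b₁ (variance 6/n).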


6. Tests of kurtosis from the power function viewpoint

It may be useful to open this section with an interpretation of the results of the previous
section from the point of view of the type of error theory of J. Neyman & E. S. Pearson
(1933, 1936). For this we consider two universes of the field, the normal W₀ and any
non-normal universe W₁, and two tests of kurtosis a(4) = b₂ and a(c₁) for a particular value c₁ of
c. Suppose that samples are sufficiently large that a(c), for samples from all universes of the
field, may be regarded as normally distributed.

Given a probability α, a sample number n can be found so that the mean value of a(c₁)
from W₁ lies exactly at, say, the upper α probability point of the distribution of a(c₁) from W₀.
Then from the results established in the preceding section the value of a(4) for the same sample
of n from W₁ would lie beyond the α probability point of a(4) for normal samples of n.
Suppose that the rule adopted was to regard as non-normal all samples for which a(c)
lies beyond the normal α probability point, and suppose that a very large number N of
samples were drawn, N₀ from universes not significantly different from normal (defining
'insignificance' in some manner) and N₁ from non-normal universes, so that N = N₀ + N₁,
where N₀ and N₁ are not necessarily known in advance. Then using a(c₁) the number of
erroneous allocations will be approximately αN₀ + ½N₁, whereas using a(4) the number will
be αN₀ + (½ − p)N₁ (½ > p > 0), showing a definite advantage in favour of a(4). The same
conclusion emerges whatever value of c ≠ 4 or whatever non-normal universe be taken
for comparison.

The type of error approach reveals the theoretical weakness of using the method of § 5 
for the assessment of relative efficiency of tests of normality ; namely that the proportion of 




errors of judgment, even using a(4), remains large, due fundamentally to concentrating on a
single value (the mean) as typical or representative of samples from the non-normal universe;
it is also a disadvantage that the sample number n₁ is necessarily a function of the particular
value c₁ of c. The method has further disadvantages of which the principal are perhaps (i) a
somewhat restricted field of alternative universes; (ii) the assumption that the samples were
indefinitely large, essential to justify the normality of a(c) for samples from any member of
the universe field.

The Neyman-Pearson power function approach which will now be considered cannot be 
regarded as entirely free from these objections in its application to the material so far 
available from this research. It enables us, at any rate, to contemplate samples which, if not 
small, are within the range of experimental practicability. 

The problem of the relative efficiency of the different members of a field of tests of kurtosis 
a(c) will now be considered in its power function aspects. For the present purpose the power 
may be defined as follows : 

Given a probability α (say 0.01), a sample number n, a particular value c₁ of c and a
non-normal parent universe W₁, the power, in relation to these data, represents the frequency
of a(c₁) for samples drawn at random from W₁ lying beyond the α probability point for a(c₁)
computed from samples drawn from a normal universe. The greater the power the more
discriminatory the test. Accordingly, it is in theory necessary to know the frequency
distribution of a(c) for all sample sizes, for all values of c and for all universes. Considering that
the only frequency distribution of the field contemplated which can be regarded as
determined for all sample sizes is a(1) for normal samples (Geary, 1935, 1936), many compromises
are necessary to give any kind of practical effect to the power concept. The compromises
proposed are as follows:

(1) The form a₁(c), given by (3.2), is used instead of the form a(c) given by (3.1).

(2) Only large samples are dealt with.

(3) The field of alternative universes is restricted.

Using a₁(c), the first four moments (from the origin) of a₁(c) for samples from any universe
can be expanded without real difficulty, and so approximate frequency distributions (using
the Karl Pearson or Gram-Charlier systems) can be obtained. As to (1), from experiments
in a(1) and a(4) the writer has verified that, for medium-sized normal samples, there is little
difference between the probability points (e.g. 0.01, 0.05) of a₁(c) and a(c), though the higher
semi-invariants (given n) are larger for the latter. In regard to (2) and (3) little confidence
could be reposed in the values of the moments computed from expansions even to n⁻³ unless
the sample number was at least of the order of 100 when c is greater than, say, 3; and, even
if the moments were known exactly, the empirical frequencies would be more than doubtful
for small samples. The approach finds its main justification in the consideration that any
errors due to these necessary compromises may be presumed to apply more or less equally
and in the same direction to the tests of kurtosis compared; generous, perhaps too generous,
advantage is taken of this justification in the concluding part of this section.

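Set, then,

$$a_1(c) = \frac{\frac{1}{n}\sum_i |x_i|^{c}}{\left\{\frac{1}{n}\sum_i x_i^{2}\right\}^{c/2}}, \tag{6.1}$$

so that

$$a_1(c) = \alpha\left(1 + \frac{1}{n}\sum_i y_i\right)\left(1 + \frac{1}{n}\sum_i z_i\right)^{-c/2}, \tag{6.2}$$

where

$$\alpha = \mu_{|c|}/\mu_2^{c/2}, \qquad y_i = (|x_i|^{c} - \mu_{|c|})/\mu_{|c|}, \qquad z_i = (x_i^{2} - \mu_2)/\mu_2, \tag{6.3}$$

the universal mean being taken as zero, without loss of generality. Raising (6.2) to powers
1, 2, 3, 4, expanding to the required degree the final factor, multiplying by the first factor
on the right, and setting down the mean value of each term we find, to n⁻³,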

$$\begin{aligned}
M'_1/\alpha = 1 &- \frac{1}{n}\{k_1^{(1)}(11) - k_2^{(1)}(02)\} + \frac{1}{n^{2}}\{k_2^{(1)}(12) - k_3^{(1)}[(03) + 3(11)(02)] + 3k_4^{(1)}(02)^{2}\}\\
&+ \frac{1}{n^{3}}\{k_3^{(1)}[3(11)(02) - (13)] + k_4^{(1)}[(04) - 3(02)^{2} + 4(11)(03) + 6(12)(02)]\\
&\qquad - k_5^{(1)}[10(03)(02) + 15(11)(02)^{2}] + 15k_6^{(1)}(02)^{3}\},
\end{aligned} \tag{6.4}$$

$$\begin{aligned}
M'_2/\alpha^{2} = 1 &+ \frac{1}{n}\{k_2^{(2)}(02) - 2k_1^{(2)}(11) + (20)\} + \frac{1}{n^{2}}\{-k_3^{(2)}(03) + 3k_4^{(2)}(02)^{2}\\
&\qquad + 2k_2^{(2)}(12) - 6k_3^{(2)}(11)(02) - k_1^{(2)}(21) + k_2^{(2)}(20)(02) + 2k_2^{(2)}(11)^{2}\}\\
&+ \frac{1}{n^{3}}\{k_4^{(2)}[(04) - 3(02)^{2}] - 10k_5^{(2)}(03)(02) + 15k_6^{(2)}(02)^{3} - 2k_3^{(2)}[(13) - 3(11)(02)]\\
&\qquad + 4k_4^{(2)}[2(11)(03) + 3(12)(02)] - 30k_5^{(2)}(11)(02)^{2}\\
&\qquad + k_2^{(2)}[(22) - (20)(02)] - k_3^{(2)}[(20)(03) + 3(21)(02)] - 2k_2^{(2)}(11)^{2}\\
&\qquad - 6k_3^{(2)}(12)(11) + 12k_4^{(2)}(11)^{2}(02) + 3k_4^{(2)}(20)(02)^{2}\},
\end{aligned} \tag{6.5}$$

$$\begin{aligned}
M'_3/\alpha^{3} = 1 &+ \frac{1}{n}\{k_2^{(3)}(02) - 3k_1^{(3)}(11) + 3(20)\} + \frac{1}{n^{2}}\{-k_3^{(3)}(03) + 3k_4^{(3)}(02)^{2}\\
&\qquad + 3k_2^{(3)}(12) - 9k_3^{(3)}(11)(02) - 3k_1^{(3)}(21) + 3k_2^{(3)}(20)(02) + 6k_2^{(3)}(11)^{2}\\
&\qquad + (30) - 3k_1^{(3)}(20)(11)\}\\
&+ \frac{1}{n^{3}}\{k_4^{(3)}[(04) - 3(02)^{2}] - 10k_5^{(3)}(03)(02) + 15k_6^{(3)}(02)^{3}\\
&\qquad - 3k_3^{(3)}[(13) - 3(11)(02)] + 6k_4^{(3)}[2(11)(03) + 3(12)(02)] - 45k_5^{(3)}(11)(02)^{2}\\
&\qquad + 3k_2^{(3)}[(22) - (20)(02)] - 3k_3^{(3)}[(20)(03) + 3(21)(02)] + 9k_4^{(3)}(20)(02)^{2}\\
&\qquad - 6k_2^{(3)}(11)^{2} - 18k_3^{(3)}(12)(11) + 36k_4^{(3)}(11)^{2}(02) - k_1^{(3)}(31)\\
&\qquad + k_2^{(3)}(30)(02) + 3k_1^{(3)}(20)(11)\\
&\qquad + 3k_2^{(3)}[(20)(12) + 2(21)(11)] - 9k_3^{(3)}(20)(11)(02) - 6k_3^{(3)}(11)^{3}\},
\end{aligned} \tag{6.6}$$

$$\begin{aligned}
M'_4/\alpha^{4} = 1 &+ \frac{1}{n}\{k_2^{(4)}(02) - 4k_1^{(4)}(11) + 6(20)\} + \frac{1}{n^{2}}\{-k_3^{(4)}(03) + 3k_4^{(4)}(02)^{2}\\
&\qquad + 4k_2^{(4)}(12) - 12k_3^{(4)}(11)(02) - 6k_1^{(4)}(21) + 6k_2^{(4)}(20)(02) + 12k_2^{(4)}(11)^{2}\\
&\qquad + 4(30) - 12k_1^{(4)}(20)(11) + 3(20)^{2}\}\\
&+ \frac{1}{n^{3}}\{k_4^{(4)}[(04) - 3(02)^{2}] - 10k_5^{(4)}(03)(02) + 15k_6^{(4)}(02)^{3} - 4k_3^{(4)}[(13) - 3(11)(02)]\\
&\qquad + 8k_4^{(4)}[2(11)(03) + 3(12)(02)] - 60k_5^{(4)}(11)(02)^{2} + 6k_2^{(4)}[(22) - (20)(02)]\\
&\qquad - 6k_3^{(4)}[(20)(03) + 3(21)(02)] + 18k_4^{(4)}(20)(02)^{2} - 12k_2^{(4)}(11)^{2}\\
&\qquad - 36k_3^{(4)}(12)(11) + 72k_4^{(4)}(11)^{2}(02) - 4k_1^{(4)}(31) + 4k_2^{(4)}(30)(02)\\
&\qquad + 12k_1^{(4)}(20)(11) + 12k_2^{(4)}[(12)(20) + 2(21)(11)]\\
&\qquad - 36k_3^{(4)}(20)(11)(02) - 24k_3^{(4)}(11)^{3} + (40) - 4k_1^{(4)}(30)(11) - 3(20)^{2}\\
&\qquad - 6k_1^{(4)}(20)(21) + 3k_2^{(4)}(20)^{2}(02) + 12k_2^{(4)}(20)(11)^{2}\},
\end{aligned} \tag{6.7}$$

where

$$k_r^{(p)} = \frac{1}{r!}\,\tfrac12 pc\,(\tfrac12 pc + 1)(\tfrac12 pc + 2)\ldots(\tfrac12 pc + r - 1), \qquad (fg) = Ey_i^{f}z_i^{g},$$

the latter, of course, the same for all i. The (fg) required for the computation of (6.4)-(6.7) are

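$$\begin{aligned}
(11) &= (\mu_{|c+2|} - \mu_{|c|}\mu_2)/\mu_{|c|}\mu_2,\\
(02) &= (\mu_4 - \mu_2^{2})/\mu_2^{2},\\
(12) &= (\mu_{|c+4|} - 2\mu_{|c+2|}\mu_2 - \mu_{|c|}\mu_4 + 2\mu_{|c|}\mu_2^{2})/\mu_{|c|}\mu_2^{2},\\
(03) &= (\mu_6 - 3\mu_4\mu_2 + 2\mu_2^{3})/\mu_2^{3},\\
(04) &= (\mu_8 - 4\mu_6\mu_2 + 6\mu_4\mu_2^{2} - 3\mu_2^{4})/\mu_2^{4},\\
(13) &= [\mu_{|c+6|} - 3\mu_{|c+4|}\mu_2 + 3\mu_{|c+2|}\mu_2^{2} - \mu_{|c|}(\mu_6 - 3\mu_4\mu_2 + 3\mu_2^{3})]/\mu_{|c|}\mu_2^{3},\\
(21) &= [\mu_{|2c+2|} - 2\mu_{|c+2|}\mu_{|c|} - \mu_2(\mu_{|2c|} - 2\mu_{|c|}^{2})]/\mu_{|c|}^{2}\mu_2,\\
(22) &= (\mu_{|2c+4|} - 2\mu_{|2c+2|}\mu_2 + \mu_{|2c|}\mu_2^{2} - 2\mu_{|c+4|}\mu_{|c|} + 4\mu_{|c+2|}\mu_{|c|}\mu_2 - 3\mu_{|c|}^{2}\mu_2^{2} + \mu_{|c|}^{2}\mu_4)/\mu_{|c|}^{2}\mu_2^{2},\\
(20) &= (\mu_{|2c|} - \mu_{|c|}^{2})/\mu_{|c|}^{2},\\
(30) &= (\mu_{|3c|} - 3\mu_{|2c|}\mu_{|c|} + 2\mu_{|c|}^{3})/\mu_{|c|}^{3},\\
(31) &= (\mu_{|3c+2|} - 3\mu_{|2c+2|}\mu_{|c|} + 3\mu_{|c+2|}\mu_{|c|}^{2} - \mu_{|3c|}\mu_2 + 3\mu_{|2c|}\mu_{|c|}\mu_2 - 3\mu_{|c|}^{3}\mu_2)/\mu_{|c|}^{3}\mu_2,\\
(40) &= (\mu_{|4c|} - 4\mu_{|3c|}\mu_{|c|} + 6\mu_{|2c|}\mu_{|c|}^{2} - 3\mu_{|c|}^{4})/\mu_{|c|}^{4}.
\end{aligned} \tag{6.8}$$

(6.8) is, of course, an immediate consequence of (6.3). The writer has checked the accuracy
of formulae (6.4)-(6.7) by reference to the normal universe for c = 1.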
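The (fg) are mechanical to compute once the absolute moments of the parent universe are available. The Python sketch below is an illustration under stated assumptions: the parent is taken to be the unit normal multiplied by 1 + (λ₄/24)(x⁴ − 6x² + 3), for which the k-th absolute moment is the normal one times 1 + λ₄k(k − 2)/24 (so μ₂ = 1), and (fg) is expanded binomially as E(Y − 1)^f (Z − 1)^g with Y = |x|^c/μ_|c| and Z = x². The function names and the parameter `lam4` are hypothetical.

```python
from math import comb, gamma, pi, sqrt


def abs_moment(k, lam4=0.0):
    """E|x|^k for the assumed parent: normal times 1 + lam4*(x^4 - 6x^2 + 3)/24."""
    normal = 2.0 ** (k / 2) * gamma((k + 1) / 2) / sqrt(pi)
    return normal * (1.0 + lam4 * k * (k - 2) / 24.0)


def fg(f, g, c, lam4=0.0):
    """(fg) = E[y^f z^g] with y = |x|^c/mu_c - 1 and z = x^2 - 1 (mu_2 = 1)."""
    mu_c = abs_moment(c, lam4)
    total = 0.0
    for i in range(f + 1):
        for j in range(g + 1):
            sign = (-1) ** (f - i + g - j)
            total += (sign * comb(f, i) * comb(g, j)
                      * abs_moment(i * c + 2 * j, lam4) / mu_c ** i)
    return total


def first_order_mean_coefficient(c, lam4=0.0):
    """Coefficient of 1/n in M'_1/alpha: k2*(02) - k1*(11), with p = 1."""
    k1 = c / 2.0
    k2 = (c / 2.0) * (c / 2.0 + 1.0) / 2.0
    return k2 * fg(0, 2, c, lam4) - k1 * fg(1, 1, c, lam4)


if __name__ == "__main__":
    for c in (4, 1):
        for lam4 in (0.0, 0.5):
            print(c, lam4, fg(1, 1, c, lam4), fg(2, 0, c, lam4),
                  first_order_mean_coefficient(c, lam4))
```

Only the first-order coefficient of the mean is shown; the higher-order terms of (6.4)-(6.7) follow the same pattern with the remaining (fg) and the coefficients k for p = 2, 3, 4.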

The reader will have no illusions as to the magnitude of the task of applying the foregoing
theory to particular cases. The formulae are set down, however, in the hope that other
researchers will be sufficiently sensible of the importance of the theory to assist in building
up a fairly extensive set of results. The writer has to be content, in the meantime, to consider
the case of the symmetrical universe field given by

$$f(x) = \frac{1}{\sqrt{(2\pi)}}\,e^{-\frac12 x^{2}}\left\{1 + \frac{\lambda_4}{24}(x^{4} - 6x^{2} + 3)\right\} \tag{6.9}$$

when λ₄ = ½, the normal being given, of course, for λ₄ = 0, and for c = 4 and c = 1. These
values of c are selected because the theory in § 5 has suggested that a(4) is probably the most
efficient of the test-field a(c), while a(1) is the only member of the field for which the normal

Table 8. Moments from formulae (6.8)

(fg)            c = 4                           c = 1
          Normal       λ₄ = ½             Normal       λ₄ = ½
(11)      4            5.428571           1            1.17021276
(02)      2            2.5                2            2.5
(12)      24           45.64286           3            4.88297871
(03)      8            14                 8            14
(04)      60           138                60           138
(13)      216          544.2857           21           44.106383
(21)      256/3        177.71428          1.141593     1.75544898
(22)      2,720/3      2,481.92857        7.707963     14.766814
(20)      32/3         16.142857          0.570796     0.63834981
(30)      352          799.142857         0.429204     0.6405182
(31)      4,352        12,785.2653        3            5.236134
(40)      23,552       73,250.178         -            2.002492






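distribution is known for samples of all sizes. The necessary moments (fg) given by (6.8)
are shown in Table 8. Based on the values in this table, moments (M′) given by (6.4)-(6.7)
of a₁(c) and semi-invariants (L) derived therefrom are as follows. The normal values are,
of course, known exactly but were computed for the purpose of checking the formulae:

c = 4; normal universe

$$\begin{aligned}
\frac{L_1}{3} &= \frac{M'_1}{3} = 1 - \frac{2}{n} + \frac{4}{n^{2}} - \frac{8}{n^{3}},\\
\frac{M'_2}{9} &= 1 - \frac{4}{3n} - \frac{28}{n^{2}} + \frac{1040}{3n^{3}}, & \frac{L_2}{9} &= \frac{8}{3n} - \frac{40}{n^{2}} + \frac{1136}{3n^{3}},\\
\frac{M'_3}{27} &= 1 + \frac{2}{n} - \frac{48}{n^{2}} - \frac{1040}{n^{3}}, & \frac{L_3}{27} &= \frac{64}{n^{2}} - \frac{2368}{n^{3}},\\
\frac{M'_4}{81} &= 1 + \frac{8}{n} + \frac{40}{3n^{2}} - \frac{3520}{n^{3}}, & \frac{L_4}{81} &= \frac{3840}{n^{3}}.
\end{aligned}$$

c = 4; universal λ₄ = ½

$$\begin{aligned}
\frac{L_1}{3.5} &= \frac{M'_1}{3.5} = 1 - \frac{3.357}{n} + \frac{11.822}{n^{2}} + \frac{12.1}{n^{3}},\\
\frac{M'_2}{(3.5)^{2}} &= 1 - \frac{2.286}{n} - \frac{57.34}{n^{2}} + \frac{776.03}{n^{3}}, & \frac{L_2}{(3.5)^{2}} &= \frac{4.4286}{n} - \frac{92.25}{n^{2}} + \frac{831.2}{n^{3}},\\
\frac{M'_3}{(3.5)^{3}} &= 1 + \frac{3.215}{n} - \frac{107.47}{n^{2}} - \frac{2853.89}{n^{3}}, & \frac{L_3}{(3.5)^{3}} &= \frac{144.61}{n^{2}} - \frac{6193.95}{n^{3}},\\
\frac{M'_4}{(3.5)^{4}} &= 1 + \frac{13.143}{n} + \frac{20.49}{n^{2}} - \frac{9529}{n^{3}}, & \frac{L_4}{(3.5)^{4}} &= \frac{10{,}587}{n^{3}}.
\end{aligned}$$

c = 1; normal universe

$$\begin{aligned}
L_1 &= M'_1 = 0.7978845608 + \frac{0.19947114}{n} + \frac{0.02493389}{n^{2}} - \frac{0.03116737}{n^{3}},\\
L_2 &= \frac{0.04507034}{n} - \frac{0.07957747}{n^{2}} + \frac{0.03978874}{n^{3}},\\
L_3 &= -\frac{0.01685645}{n^{2}} + \frac{0.07613597}{n^{3}}.
\end{aligned}$$

c = 1; universal λ₄ = ½

$$\begin{aligned}
\frac{L_1}{\mu_{|1|}} &= \frac{M'_1}{\mu_{|1|}} = 1 + \frac{0.35239362}{n} - \frac{0.159616}{n^{2}} - \frac{0.745838}{n^{3}},\\
\frac{M'_2}{\mu_{|1|}^{2}} &= 1 + \frac{0.79792429}{n} - \frac{0.458012}{n^{2}} - \frac{1.800648}{n^{3}}, & \frac{L_2}{\mu_{|1|}^{2}} &= \frac{0.09313705}{n} - \frac{0.262961}{n^{2}} - \frac{0.196477}{n^{3}},\\
\frac{M'_3}{\mu_{|1|}^{3}} &= 1 + \frac{1.336592}{n} - \frac{0.850081}{n^{2}} - \frac{3.239101}{n^{3}}, & \frac{L_3}{\mu_{|1|}^{3}} &= -\frac{0.053356}{n^{2}} + \frac{0.204164}{n^{3}},\\
\mu_{|1|} &= 0.78126197.
\end{aligned}$$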

Two sample sizes were considered: n = 100 and n = 500. For n = 100 and c = 4, the 





Testing for normality 


following are the Pearson Type IV frequencies of $a_1(4)$ when the parent universes are normal
and have $\lambda_4 = \beta_2 - 3 = \tfrac12$ respectively:

Normal: A 4 = 0. ac cos 118350 6 e 13 * 01543 * dx 9 ] 


$\tan\theta = (x - 1.873387)/0.765849$,
$\log_{10} a_c = 3.2644596$.

(6.10)


A 4 - i : ac cos 6 00 *® 0 e 2 ' 3128 * d#, 

$\tan\theta = (x - 2.8522)/0.9062$,
$\log_{10} a_c = 1.7499974$.

(6.11)

The normal probability points shown in column (2) of Table 9 were derived from the
foregoing normal frequency (6.10); the points in column (3) were derived from a Gram-Charlier
formula (Geary, 1935). The 0.01 and 0.05 points given in column (2) are practically identical
with those given by E. S. Pearson (1929) for a(4), namely, 4.39 and 3.77. The powers given in
column (4) are the aggregate frequencies lying beyond the values of the variate shown in
column (2) on the assumption that the actual frequency was (6.11). The corresponding
figures for c = 1 given in column (5) were based on a Gram-Charlier formula.


Table 9. Power of $a_1(c)$ for c = 4 and c = 1 of discriminating (6.9) for $\lambda_4 = \tfrac12$ from
the normal ($\lambda_4 = 0$) at four normal theory probability levels. Samples of 100

Normal theory    Normal theory probability points    Power for frequency (6.9) with $\lambda_4 = \tfrac12$
probability      c = 4 (upper)    c = 1 (lower)       c = 4        c = 1
(1)              (2)              (3)                 (4)          (5)

0.01             4.3836           0.7482              0.0648       0.0695
0.05             3.7744           0.7642              0.1995       0.1979
0.10             3.5195           0.7726              0.3163       0.3037
0.20             3.3110           0.7824              0.4525       0.4597


Before discussing the comparative powers in Table 9 it will be convenient to give a
further table, Table 10, on the same lines but for n = 500. On account of the larger sample
size it has been necessary to change the reference-probabilities given in column (1). For the
construction of this table Gram-Charlier formulae were used throughout (the probability
points being determined from the E. A. Cornish & R. A. Fisher (1937) formulae) after verifying
that for two of the probability levels, 0.01 and 0.05, the probability points for c = 4
(column (2) of Table 10) did not differ appreciably from those given by E. S. Pearson,
namely, 3.60 and 3.37 (for a(4)), based on a Type IV curve.

The analysis in § 5 has enabled us to come fairly firmly to the conclusion that for indefinitely
large samples a(4) was to be preferred to a(1) as a test of normality. We see from Tables 9
and 10 that this is subject to an important qualification. Table 10 shows that the
discriminating power is definitely greater for samples of 500 for a(4) than for a(1), but the
superiority is less emphatic than might have been anticipated from § 5. For medium-sized samples
(Table 9) a(4) exhibits no superiority. Of course, these conclusions are very tentative,
as being based upon a single alternative and on particular sample sizes. The writer had 
proposed, in addition, to examine the universes (i) A 3 = 0, A 4 = 1 and (ii) A£ = A 4 = £ as 
alternatives to the normal but time did not permit; he ventures to repeat the hope that other 
students will take the matter up. 





Table 10. Power of $a_1(c)$ for c = 4 and c = 1 of discriminating (6.9) for $\lambda_4 = \tfrac12$
from normal ($\lambda_4 = 0$) at four probability levels. Samples of 500

Normal           Normal probability points           Power for frequency (6.9) with $\lambda_4 = \tfrac12$
probability      c = 4 (upper)    c = 1 (lower)       c = 4        c = 1
(1)              (2)              (3)                 (4)          (5)

0.005            3.7002           0.773107            0.1934       0.2007
0.01             3.6094           0.775684            0.2920       0.2790
0.05             3.3700           0.782482            0.5965       0.5190
0.10             3.2095           0.786058            0.7392       0.6509


7. Conclusion and summary 

In § 2 of the present paper it is shown that the actual probability of differences between 
means and variances derived from random samples on the null hypothesis may differ
considerably from the probability derived from the standard tables (compiled on the 
assumption that the universal distribution is normal), when, in fact, the universal distribu- 
tion is not normal. Accordingly, the standard tables cannot validly be used unless tests, 
based on the sample from which the inferences are to be drawn, or on a series of samples 
produced under similar conditions, have established the likelihood that the universal 
distribution is approximately normal. In certain cases — but these must be few — the nature 
of the material may, of itself, suffice to justify the assumption of universal normality. 
When universal normality cannot be assumed, the best course will be to correct the standard 
tables using, for this purpose, the moments (up to, say, the fourth) derived from the sample, 
in conjunction with the formulae given in §2. This procedure is, of course, open to the objection 
that the moments derived from the sample may, in fact, differ substantially from the (in 
general unknown) universal moments, so that any probabilistic inference derived using 
sample moments must be accepted with reserve. If $b_2 = 3.5$, say, it would be safer to assume
that the universal value $\beta_2$ is 3.5 than to hope (without other evidence) that it is 3, the
normal value; it might be 3.75 or even 4, when, usually, the standard table probabilities
will be still further astray. It should not be difficult to construct supplementary tables 
giving very approximate corrections of the standard tables, using the moment expansions 
given in § 2, for different values of $\beta_1$ and $\beta_2$. To compute unbiassed estimates of the latter,
R. A. Fisher’s k statistics (1929) should, of course, be used. 

It may be asked if testing for normality and, when necessary, correction for universal 
non-normality is worth the trouble. To answer this question it is desirable to have regard to 
the logical position of the statistician, concerned with drawing inferences from samples, 
whose characteristic approach may be defined as reductio ad paene absurdum : if an event is 
highly improbable it must be regarded for practical purposes as impossible. St Thomas 
Aquinas’s* famous ‘certitude of probability’ is peculiarly apt as applied to the mental 
attitude of the statistician, from two quite different viewpoints. The first is that decision, 
and action based on that decision, for which there is not certainty, but merely probabilistic 
preference, is absolute. One does not say that one has a preference of 20 to 1 for Fertilizer A 

* ‘According to the Philosopher, certitude is not to be sought equally in every matter.... Hence
the certitude of probability suffices, such as may reach the truth in the greater number of cases, although
it fails in the minority’ (Summa IIa-IIae, q. lxx, a. 2).




over Fertilizer B because the difference between the yields is at or near the 5 % probability
point of some test function: one necessarily decides without qualification that A is better
than B. 

The second aspect, which has the greater relevance in the present case, is that the statis- 
tician regards himself as endowed with ‘ certitude ’ when he knows that if he repeated an 
experiment, as to, say, significant differences in averages, a great number of times, he would 
be in error in attributing significant difference when, in fact, there was none, in a predeter- 
mined proportion of cases. He has certitude as to the probability though his decision in the 
individual case may be wrong. What is curious is that decisions (which, in effect, are absolute) 
can be based on probability levels which vary with the temperament of the statistician from 
perhaps a conservative 0.001 to a daring 0.1. For the particular statistician the probability
level will vary with the case: for instance, the present writer would be inclined to suspect
non-normality near the 10 % probability level of the a(1) table, whereas he would not be
disposed to attach significance in, say, analysis of variance, until about the % level. 
Naturally the level will depend on the importance attaching to the decision. 

Since all the statistician usually requires from the table of probability for a given measure 
of significance is whether, on the null hypothesis, the probability is ‘small’, absolute
precision is not necessary in the probability. If the probability is thought to be minute, say
0.001, it does not matter if in actual fact it is 0.002 or 0.0005. If, on the contrary, the standard
table value is approaching the statistician’s level of decision it surely matters a great deal: 
if he thinks his judgment is likely to be erroneous in 1 out of 20 experiments it must be of 
importance if, in fact, the true probability is something like 1 in 10 or 1 in 5. These are the 
kinds of contrasts that appear from §2, from comparison of standard table probabilities 
with ‘actual’ probabilities found when the samples were assumed to be randomly drawn 
from certain arbitrarily selected types of non-normal universes. The computed probabilities 
in § 2 admittedly make no claim to exactitude in most of the cases, since the formulae were 
strained by their application to small sample theory. The point is, however, that the estimates 
of the actual probabilities are unbiassed in regard to the ‘normal theory’ probabilities: 
if the former could be closer to the latter, they might also be further away. 

There is one case which is in a quite exceptional category, namely that considered at the 
beginning of § 2. As far as the writer is aware, this case has never been examined theoretically 
before, despite the extreme simplicity of the algebra. It is shown that in the simplest case 
of analysis of variance, when the two sample numbers are of the same order of magnitude, 
the variance is proportional, approximately, to $(\beta_2 - 1)$, so that quite a small measure of
universal kurtosis materially changes the probability. Statisticians must have been affected 
by a kind of hypnosis in favour of normal theory to have overlooked so trivial a point, 
a stricture from which the writer is not particularly concerned to exclude himself! An 
exception was E. S. Pearson (1931) who, on the basis of his results cited in § 2 (a), sounded 
a warning: ‘The illustration should serve to emphasize the fact that certain of the “normal 
theory” tests can be used with greater confidence than others when dealing with samples 
from populations whose distribution laws are not known.’ 
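
As a rough numerical illustration of why the factor $(\beta_2 - 1)$ matters, the sketch below uses the standard textbook formula for the variance of the unbiased sample variance, $\mathrm{Var}(s^2) = \sigma^4[(\beta_2 - 3)/n + 2/(n-1)]$. This is an assumption brought in for illustration; it is not the algebra of § 2, which lies outside this section, and the function names are hypothetical.

```python
def var_s2_over_sigma4(beta2: float, n: int) -> float:
    """Variance of the unbiased sample variance s^2, in units of sigma^4,
    for a random sample of n from a universe with kurtosis beta2."""
    return (beta2 - 3.0) / n + 2.0 / (n - 1)


def inflation(beta2: float, n: int) -> float:
    """Ratio of the actual variance of s^2 to its normal-theory value."""
    return var_s2_over_sigma4(beta2, n) / var_s2_over_sigma4(3.0, n)


# For large n the ratio approaches (beta2 - 1) / 2.
for beta2 in (3.0, 3.5, 4.0):
    print(beta2, round(inflation(beta2, 100), 4), (beta2 - 1) / 2)
```

On this formula a universal $\beta_2$ of 3.5 already inflates the variance of $s^2$ by about a quarter for samples of 100, which is the kind of effect the paragraph above describes.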

An interesting chapter could be written on the fluctuations in the attitude of statisticians 
during the past century on the question of the occurrence of the normal frequency distribution
in nature, a chapter, perhaps, in a large work on Fashions in the Sciences down the Ages.
Amongst the following the historian may find the reasons for the prejudice in favour of the 
hypothesis of universal normality up to, say, the end of the last century: 




(1) The fact that, to a close approximation, it applies in a wide range of mathematical 
conditions. 

(2) The fact that the theory found practical applications predominantly in assessing the 
probability of errors in astronomical measurements and in games of chance where the 
mathematical model could reasonably be assumed to apply. 

(3) The beauty of the mathematical theory and the facility of algebraic manipulation in 
the function involved. 

(4) The general shape to the visual sense of such frequency distributions as were known, 
before $\chi^2$ imposed its discipline.

With the development, about the beginning of the century, of the theory of moments, 
statisticians became almost over-conscious of universal non-normality. The concomitant
semi-invariant approach had quite a different background. The difference between the 
moment and Karl Pearson curve system on the one hand and semi-invariants and the Gram- 
Charlier system on the other is fundamentally that for the former normality is a particular 
case like any other, whereas for the latter normality is basic and generative. Each system 
has its advantages and disadvantages as applied to the determination of frequency distributions
of which the lower moments are known. In fanciful terms one might say that in
the ship Gram-Charlier one might sail in perfect safety but only within limited, and more 
or less ascertainable, range of Port Normality, whereas in the good craft Pearson one can 
sail the seven seas — at one’s own risk.* 

Our historian will find a significant change of attitude about a quarter-century ago following 
on the brilliant work of R. A. Fisher who showed that, when universal normality could be 
assumed, inferences of the widest practical usefulness could be drawn from samples of any 
size. Prejudice in favour of normality returned in full force and interest in non-normality
receded to the background (though one of the finest contributions to non-normal theory 
was made during the period by R. A. Fisher himself), and the importance of the underlying 
assumptions was almost forgotten. Even the few workers in the field (amongst them the 
present writer) seemed concerned to show that ‘universal non-normality doesn’t matter’:
we so wanted to find the theory as good as it was beautiful. References (when there were
any at all) in the text-books to the basic assumptions were perfunctory in the extreme.
Amends might be made in the interest of the new generation of students by printing in 
leaded type in future editions of existing text-books and in all new text-books: 

Normality is a myth; there never was, and never will be, a normal distribution.

This is an over-statement from the practical point of view, but it represents a safer initial 
mental attitude than any in fashion during the past two decades. 

As already indicated, the present work is incomplete, especially on the experimental side. 
The writer hopes that he has created a prima facie case for the importance of testing for 
normality. 

Summary 

(i) Inferences drawn from the standard (normal) tables of z and t may be seriously in 
error if the conditions in which the standard tables apply (the principal of which is that the
universes from which the samples are drawn are normal) are ignored. 

* This comment must not be taken as applying to the problem of curve-fitting, i.e. to fitting a smooth 
curve to given frequencies, but to the problem of estimating the frequency function given the first 
few semi-invariants.




(ii) Sufficient conditions are given for the approach to normality, with increasing sample
size, of the field of tests of normality a(c) (given by (3.1)) for c > 0.

(iii) Many term expansions of the first four moments of a(c) for normal samples are given 
with practical applications designed to find the values of c for which the moments could
be used with confidence to find the frequency distributions for medium-size samples;
semi-invariants of $a_1(2.4)$ and $a_1(4)$ ($a_1(c)$ is given by (3.2)) are compared; correlations between
$a_1(c)$ and $a_1(c')$ are examined.

(iv) For indefinitely large samples and a wide field of alternative universes a(4) is found 
to be the most sensitive test of kurtosis and an analogous test of asymmetry g(c) is found to 
be most sensitive for c = 3, g(3) being the familiar $\sqrt{b_1}$.

(v) An examination of the relative efficiency of a(l) and a(4) from the Power Function 
point of view suggests that a(4) is increasingly to be preferred as the sample size increases; 
for samples of moderate size a(l) is probably as efficient as a(4). 

(vi) Throughout the paper a considerable range of formulae is given in case students may 
feel interested to carry the writer’s researches a stage further so as to give a firmer basis to 
his conclusions or to modify them. It is suggested (§4) that the preparation of a table of 
probability points of a(2.2) for normal samples of different sizes be taken in hand.

REFERENCES 

Baker, G. A. (1932). Ann. Math. Statist. 3, 1.

Bartlett, M. S. (1935). Proc. Camb. Phil. Soc. 31, 226.

Cornish, E. A. & Fisher, R. A. (1937). Rev. Inst. Int. Statist. 5, 307.

Craig, C. C. (1928). Metron, 7, 3.

Eden, T. & Yates, F. (1933). J. Agric. Sci. 23, 6.

Fisher, R. A. (1925). Metron, 5, 90.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.

Fisher, R. A. & Wishart, J. (1931). Proc. Lond. Math. Soc. (2), 33, 195.

Fréchet, M. (1937). Généralités sur les Probabilités. Variables aléatoires.

Geary, R. C. (1933). Biometrika, 25, 184.

Geary, R. C. (1935). Biometrika, 27, 310, 353.

Geary, R. C. (1936). J. Roy. Statist. Soc. (Supplement), 3, 178.

Geary, R. C. (1936). Biometrika, 28, 295.

Geary, R. C. (1947). Biometrika, 34, 68.

Geary, R. C. & Pearson, E. S. (1938). Tests of Normality.

Geary, R. C. & Worlledge, J. P. G. (1946). Biometrika, 34, 98.

Gosset, W. S. (1908). Biometrika, 6, 1.

Hsu, C. T. & Lawley, D. N. (1940). Biometrika, 31, 238.

Kendall, M. G. (1941). Biometrika, 32, 81.

Nair, A. N. K. (1942). Sankhyā, 5, 393.

Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.

Neyman, J. & Pearson, E. S. (1936). Statist. Res. Mem. 1, 1.

Pearson, E. S. (1929). Biometrika, 21, 337.

Pearson, E. S. (1930). Biometrika, 22, 239.

Pearson, E. S. (1931a). Biometrika, 22, 423.

Pearson, E. S. (1931b). Biometrika, 23, 114.

Pearson, E. S. (1935). Biometrika, 27, 333.

Pearson, E. S. & Adyanthaya, N. K. (1929). Biometrika, 21, 259.

Pearson, Karl (1895). Philos. Trans. A, 186, 343.

Pepper, Joseph (1932). Biometrika, 24, 55.

Rider, P. R. (1931). Ann. Math. Statist. 2, 48.

Rietz, H. L. (1939). Ann. Math. Statist. 10, 265.

Shewhart, W. A. & Winters, F. W. (1928). J. Amer. Statist. Ass. 23, 144.

Wishart, J. (1930). Biometrika, 22, 224.

Yasukawa, K. (1934). Tohoku Math. J. 38, 465.





THE STRATIFIED SEMI-STATIONARY POPULATION

By S. VAJDA


1. Constant population 


Let a set of non-increasing real values $p_0 = 1, p_1, \ldots, p_n, p_{n+1} = 0$ be given, and let $p_i$
represent the probability of a person of age 0 surviving the i following years. Further, let
$l_0, l_1, \ldots, l_n$ represent the numbers of persons of age 0, 1, ..., n living at time t = 0. We
consider then the development of such a population during the years following t = 0, under the
assumption that the probabilities $p_i$ remain the same throughout the period investigated.

Only persons of age 0 are to enter the population, and the number of such entrants shall
be such that the total of the population is kept constant at a number $H = \sum_{i=0}^{n} l_i$. At the end
of the first year the survivors of the H persons who were alive at t = 0 will be (if we put
$l_i/p_i = r_i$, say)

$$\sum_{i=0}^{n-1} \frac{l_i}{p_i}\,p_{i+1} = \sum_{i=0}^{n-1} r_i\,p_{i+1} < H,$$

and therefore the number of entrants at the beginning of the second year (i.e. at t = 1) is

$$\phi_1 = H - \sum_{i=0}^{n-1} r_i\,p_{i+1}.$$

By the same argument the entrants at t = 2 will be

$$\phi_2 = H - \phi_1 p_1 - \sum_{i=0}^{n-2} r_i\,p_{i+2},$$

and so on; generally

$$\phi_t = H - \phi_{t-1}p_1 - \phi_{t-2}p_2 - \ldots - \phi_1 p_{t-1} - \sum_{i=0}^{n-t} r_i\,p_{i+t} \tag{1}$$

as long as $t \le n$, that is, as long as there are survivors of the initial population. For $t > n$
we obtain

$$H = \phi_t + \phi_{t-1}p_1 + \phi_{t-2}p_2 + \ldots + \phi_{t-n}p_n. \tag{2}$$

We want to find an expression for $\phi_t$, which must obviously depend on $l_1, l_2, \ldots, l_n$. Now
(2) is a difference equation for the function $\phi_t$ of t and can easily be solved. For this purpose
consider the ‘characteristic equation’

$$x^n + x^{n-1}p_1 + x^{n-2}p_2 + \ldots + x\,p_{n-1} + p_n = 0. \tag{3}$$

Let this equation have the roots $x_1, x_2, \ldots, x_r$, where $x_i$ is a $k_i$-fold root and $x_i \neq x_j$
for $i \neq j$. We have then as a solution of the difference equation (2)

$$\phi_t = H_1 + P_1(t)\,x_1^t + \ldots + P_r(t)\,x_r^t, \tag{4}$$

where $H_1 = H/\sum p_i$ and $P_i(t) = \alpha_{i1} + \alpha_{i2}t + \ldots + \alpha_{ik_i}t^{k_i-1}$. The $\alpha_{ij}$ must be found from the
initial population, i.e. from equations of the form (1) which contain the first n numbers of
entrants $\phi_1, \phi_2, \ldots, \phi_n$. But we find by inspecting these equations, which are of the form ($t \le n$)

$$H = \phi_t + \phi_{t-1}p_1 + \ldots + \phi_1 p_{t-1} + r_0 p_t + r_1 p_{t+1} + \ldots + r_{n-t}p_n,$$




that they are equivalent to

$$r_t = \phi_{-t} = H_1 + \sum_{i=1}^{r} P_i(-t)\,x_i^{-t}. \tag{5}$$

Hence the $\alpha_{ij}$ can be fixed, dependent on the $r_i = l_i/p_i$ and thus on the initial population.
We have thus proved:

If a population with an age distribution $l_0, l_1, \ldots, l_n$ is subject to survival rates $p_i$ (i = 1, 2, ..., n),
and if this population is kept constant by $\phi_t$ entrants of age 0 at the end of the t-th year, then $\phi_t$
is given by (4), where the $x_i$ are the different roots of (3), and the $\alpha_{ij}$ must be found from the set (5).

The population after t years will have the following age distribution:

$$\phi_t,\ \phi_{t-1}p_1,\ \phi_{t-2}p_2,\ \ldots,\ \phi_{t-n}p_n.$$

It can easily be proved that, if the $p_i$ are decreasing (and not merely non-increasing), then
for all the roots $x_i$ of equation (3) we have $|x_i| < 1$ and that any real root must be negative.
Hence the $\phi_t$ will oscillate around their limit $\lim_{t \to \infty} \phi_t = H_1$. The age distribution of the
population thus tends, again through oscillations, to $H_1, H_1 p_1, \ldots, H_1 p_n$, which may be called the
intrinsic stationary population. Obviously, if the initial population has already this
distribution, it will not alter any more and the number of entrants will be constant and $= H_1$.
In such a case all $\alpha_{ij} = 0$, and $r_i = H_1 = l_0$, whatever the $x_i$ may be.

On the other hand, if $p_i = p_{i+1}$ holds for one or more values of i, then we may get cycles,
and this is easily seen for the equation $x^n + x^{n-1} + \ldots + x + 1 = 0$. All roots have modulus 1,
and it depends on the initial population whether we are dealing with the stationary case or
with periodic cycles. No tendency towards an intrinsic stationary population appears in
such a case.
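
The cyclic case can be seen in a small sketch. It assumes the hypothetical survival set $p_0 = p_1 = p_2 = 1$, $p_3 = 0$ (everyone survives to age 2 and then leaves), for which the characteristic equation is $x^2 + x + 1 = 0$; the initial numbers and the helper `step` are illustrative and not taken from the paper.

```python
def step(ages, p):
    """Advance one year: the cohort aged i reaches age i+1 with probability
    p[i+1]/p[i]; entrants of age 0 restore the constant total."""
    total = sum(ages)
    older = [ages[i] * p[i + 1] / p[i] for i in range(len(ages) - 1)]
    return [total - sum(older)] + older


p = [1, 1, 1]        # p_0, p_1, p_2 (p_3 = 0): no deaths before the last age
ages = [5, 1, 3]     # hypothetical initial numbers at ages 0, 1, 2

history = [ages]
for _ in range(6):
    history.append(step(history[-1], p))

for t, column in enumerate(history):
    print(t, column)
```

The age distribution repeats every three years and never settles, whereas a start from equal numbers at every age stays put, in line with the two possibilities named above.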

Example. Let us assume that we have the following probabilities of survival:

    p_1    p_2      p_3     p_4       p_5      p_6
    7/8    49/96    5/32    13/384    1/384    0

The characteristic equation can then be written

$$384x^5 + 336x^4 + 196x^3 + 60x^2 + 13x + 1 = 0,$$

which has the five different roots

$$-\tfrac14 \pm \sqrt{-\tfrac{5}{48}}, \qquad -\tfrac18 \pm \sqrt{-\tfrac{7}{64}} \qquad \text{and} \qquad -\tfrac18.$$
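
As a check on the roots quoted for the example, the following sketch substitutes each of them into the characteristic polynomial and also confirms that every modulus is below 1, as stated earlier for strictly decreasing $p_i$. The variable names are illustrative only.

```python
import math

coeffs = [384, 336, 196, 60, 13, 1]   # 384x^5 + 336x^4 + ... + 13x + 1


def poly(x):
    """Evaluate the characteristic polynomial by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


roots = [
    complex(-0.25, math.sqrt(5 / 48)),    # -1/4 + sqrt(-5/48)
    complex(-0.25, -math.sqrt(5 / 48)),   # -1/4 - sqrt(-5/48)
    complex(-0.125, math.sqrt(7 / 64)),   # -1/8 + sqrt(-7/64)
    complex(-0.125, -math.sqrt(7 / 64)),  # -1/8 - sqrt(-7/64)
    complex(-0.125, 0.0),                 # -1/8
]

residuals = [abs(poly(x)) for x in roots]   # should all be ~0
moduli = [abs(x) for x in roots]            # should all be < 1
print(residuals)
print(moduli)
```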

The initial population will be assumed to be

    l_0    l_1     l_2    l_3    l_4    l_5
    859    1269    229    50     115    56

which implies $H_1 = 1000$ (approx.). Therefore the $r_i = l_i/p_i$ are

    r_0    r_1        r_2       r_3       r_4        r_5
    859    1450.38    448.86    319.14    3405.42    21430.90

From $r_k = 1000 + a_1 x_1^{-k} + a_2 x_2^{-k} + \ldots + a_5 x_5^{-k}$ (k = 1, 2, ..., 5) we find

    a_1          a_2          a_3    a_4    a_5
    -70 + 60i    -70 - 60i    0      0      -1

It follows that the number of entrants in the year t (= 0, 1, 2, ...) will be

$$\phi_t = 1000\left[1 + (-0.07 + 0.06i)\left(-\tfrac14 + \sqrt{-\tfrac{5}{48}}\right)^t + (-0.07 - 0.06i)\left(-\tfrac14 - \sqrt{-\tfrac{5}{48}}\right)^t - 0.001\left(-\tfrac18\right)^t\right].$$




These numbers are given in the first row of Table 1, which shows the evolution of the whole 
population. 
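
The closed form for the entrants can be evaluated directly. The sketch below assumes the coefficients quoted in the example ($a_1, a_2 = -70 \pm 60i$, $a_3 = a_4 = 0$, $a_5 = -1$) and the root $x_1 = -\tfrac14 + \sqrt{-5/48}$; the function name `phi` is illustrative.

```python
import math

x1 = complex(-0.25, math.sqrt(5 / 48))   # root -1/4 + sqrt(-5/48)
a1 = complex(-70, 60)                    # coefficient a_1; a_2 is its conjugate


def phi(t: int) -> float:
    """Entrants in year t from the closed form (4) of the example."""
    # The conjugate pair a_1 x_1^t + a_2 x_2^t equals twice the real part
    # of a_1 x_1^t; the last term is a_5 x_5^t with a_5 = -1, x_5 = -1/8.
    return 1000 + 2 * (a1 * x1 ** t).real - (-0.125) ** t


entrants = [phi(t) for t in range(9)]
print([round(v, 2) for v in entrants])
```

The values so obtained agree with the first row of Table 1 to within about one unit; the small differences presumably arise because the printed entries are rounded so that each column adds to the constant total.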

Table 1

t          0       1       2       3       4       5       6       7       8 and after

Age 0      859     996     1025    989     1001    1001    999     1000    1000
    1      1269    752     872     897     865     876     876     874     875 (= 1000 x 7/8)
    2      229     740     438     508     523     505     511     511     510 (= 1000 x 49/96)
    3      50      70      227     134     156     160     155     156     156 (= 1000 x 5/32)
    4      115     11      15      49      29      34      34      34      34 (= 1000 x 13/384)
    5      56      9       1       1       4       2       3       3       3 (= 1000 x 1/384)

Total      2578    2578    2578    2578    2578    2578    2578    2578    2578
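
The evolution in Table 1 can also be followed year by year without the closed form. The sketch below is a minimal cohort simulation under the assumptions of § 1 (fixed survival probabilities, entrants of age 0 making up the constant total); the function `evolve` is a hypothetical helper, and it starts from the rounded initial numbers printed above.

```python
p = [1, 7/8, 49/96, 5/32, 13/384, 1/384]   # p_0 .. p_5 (p_6 = 0)
l0 = [859, 1269, 229, 50, 115, 56]          # numbers at ages 0 .. 5 when t = 0


def evolve(ages, p, years):
    """Return the age distribution for t = 0 .. years, keeping the total fixed."""
    total = sum(ages)
    table = [list(ages)]
    for _ in range(years):
        prev = table[-1]
        # Survivors: the cohort aged i reaches age i+1 with probability p[i+1]/p[i].
        older = [prev[i] * p[i + 1] / p[i] for i in range(len(prev) - 1)]
        # Entrants of age 0 replace everyone who has left.
        table.append([total - sum(older)] + older)
    return table


table1 = evolve(l0, p, 8)
for t, column in enumerate(table1):
    print(t, [round(v) for v in column])
```

In the long run the number of entrants settles at $H/\sum p_i$, the quantity called $H_1$ in the text.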


2. Two constant populations

All this covers well-known ground.* A new problem arises, however, when we consider two
initial populations with two sets of probabilities of survival, say $p_i$ (i = 1, 2, ..., $n_1$) and
$\bar p_i$ (i = 1, 2, ..., $n_2$), where $p_0 = \bar p_0 = 1$ and $p_{n_1+1} = \bar p_{n_2+1} = 0$. We ask now whether it is
possible to keep both constant by the same number of yearly entrants. More precisely:

Let the two equations

$$\sum_{i=0}^{n_1} p_i\,x^{n_1-i} = 0 \qquad \text{and} \qquad \sum_{j=0}^{n_2} \bar p_j\,y^{n_2-j} = 0$$

have the roots $x_1, \ldots, x_r$ with multiplicities $k_1, \ldots, k_r$ and $y_1, \ldots, y_s$ with multiplicities $j_1, \ldots, j_s$
respectively. No two $x_i$ or two $y_i$ are equal and no $x_i$ or $y_i$ is zero. Under what further
conditions, concerning the x's and the y's, can the expressions $\phi_t$ and $\psi_t$ then have the same
numerical values for all integral values of t, i.e.

$$\phi_t - \psi_t = 0 \qquad \text{for } t = 0, 1, 2, \ldots, \tag{6}$$

where

$$\phi_t = H_1 + \sum_{i=1}^{r} P_i(t)\,x_i^t \quad \text{with} \quad P_i(t) = \alpha_{i1} + \alpha_{i2}t + \ldots + \alpha_{ik_i}t^{k_i-1} \tag{7}$$

and

$$\psi_t = \bar H_1 + \sum_{i=1}^{s} \bar P_i(t)\,y_i^t \quad \text{with} \quad \bar P_i(t) = \beta_{i1} + \beta_{i2}t + \ldots + \beta_{ij_i}t^{j_i-1}? \tag{8}$$

Suppose first that none of the x's equals any of the y's. Then it is known that the
determinant of any set of equations of the system (6) is not zero. It follows that we must have
$H_1 = \bar H_1$ and all $\alpha$'s and $\beta$'s = 0, hence all $P_i(t)$ and $\bar P_i(t) = 0$. In this case the two populations
must already be stationary and therefore identical with the intrinsic stationary populations
which are implied by the sets $p_i$ and $\bar p_i$, respectively.

On the other hand, if some of the x's are equal to some of the y's, say $x_1 = y_1, \ldots, x_m = y_m$
and all the others are different, then we find by the same argument that $H_1 = \bar H_1$ and $P_i = \bar P_i$
for the first m values of i, whereas all the other $P_i$ and $\bar P_i$ are identically zero. (It is, of course,
again possible that all the $P_i$ and $\bar P_i$ are identically zero and that we have, in fact, again the
two intrinsic stationary populations.)

If all x's are equal to the y's, with equal multiplicities, then the two equations are equal
* It follows, for example, from results of P. H. Leslie (1945). 





and the two populations must be identical, if they are to be kept constant by equal numbers 
of entrants. 

We have thus reached the following conclusion: If we assume that the two equations given
above are not identical, and that the initial populations are not the intrinsic stationary ones,
then they must be such that the two equations have some (but not all) roots equal and if we
calculate the corresponding $P_i$ and $\bar P_i$ (see (4) and (5) of the previous section), then those
corresponding to the equal roots must be identical and the others must vanish. This includes
the case where the x's and y's are the same, but with different multiplicities, so that the
$P_i$ and $\bar P_i$ do not all extend to the highest power of t which would be admissible by (7) or (8)
respectively.

If the two populations are the intrinsic stationary ones, then the numbers of entrants will
be constant (i.e. independent of the year) and the two constants will be equal if and only if

$$\frac{\sum l_i}{\sum p_i} = \frac{\sum \bar l_i}{\sum \bar p_i},$$

where the first expression refers to the first and the second expression to the second
population.

Example. In the example used in § 1 we have $a_3 = a_4 = 0$, and we can therefore try to
obtain a second population which is kept constant by the same numbers of entrants as the
first one. We construct an equation which has again the roots $-\tfrac14 \pm \sqrt{-5/48}$ and $-\tfrac18$, but not
$-\tfrac18 \pm \sqrt{-7/64}$. Such an equation is, for example,

$$480x^5 + 396x^4 + 218x^3 + 62x^2 + 13x + 1 = 0,$$

which has the roots $-\tfrac14 \pm \sqrt{-5/48}$, $-\tfrac18$ and $-\tfrac1{10} \pm \sqrt{-9/100}$. It implies that the probabilities
of survival, i.e. $\bar p_i$, are

    33/40    109/240    31/240    13/480    1/480,

and as the $\phi_t$ (and the $r_i = \phi_{-i}$) are to be the same as in § 1, the initial population must now be
$\bar l_i = r_i \bar p_i$. Table 2 shows the development of such a population, and it will be seen that the
first line is identical with that in Table 1.
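
One way to see how such an equation can be built is to multiply out its factors. The factorisation used below, $(8x+1)(6x^2+3x+1)$ for the roots that are kept, with $8x^2+2x+1$ replaced by $10x^2+2x+1$, is worked out here for illustration and is not quoted from the paper; `multiply` is a hypothetical helper.

```python
from fractions import Fraction


def multiply(a, b):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


kept = multiply([8, 1], [6, 3, 1])     # roots -1/8 and -1/4 +/- sqrt(-5/48)
old = multiply(kept, [8, 2, 1])        # adds -1/8 +/- sqrt(-7/64): equation of section 1
new = multiply(kept, [10, 2, 1])       # adds -1/10 +/- sqrt(-9/100) instead

# Dividing by the leading coefficient gives the survival probabilities.
p_bar = [Fraction(c, new[0]) for c in new]
print(old, new, p_bar)
```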


Table 2

t          0       1       2       3       4       5       6       7       8 and after

Age 0      859     996     1025    989     1001    1001    999     1000    1000
    1      1196    709     822     845     816     826     826     824     825 (= 1000 x 33/40)
    2      204     658     390     452     465     449     455     455     454 (= 1000 x 109/240)
    3      41      58      187     111     129     132     128     129     129 (= 1000 x 31/240)
    4      92      9       12      39      23      27      27      27      27 (= 1000 x 13/480)
    5      45      7       1       1       3       2       2       2       2 (= 1000 x 1/480)

Total      2437    2437    2437    2437    2437    2437    2437    2437    2437
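
That the two populations really call for the same entrants can be checked numerically. The sketch below assumes the closed form of § 1 with the quoted coefficients, takes $r_k = \phi_{-k}$ from relation (5), builds both initial populations as $r_k p_k$ and $r_k \bar p_k$, and advances each with its own survival set; all function and variable names are illustrative.

```python
import math

p = [1, 7/8, 49/96, 5/32, 13/384, 1/384]
p_bar = [1, 33/40, 109/240, 31/240, 13/480, 1/480]
x1, a1 = complex(-0.25, math.sqrt(5 / 48)), complex(-70, 60)


def phi(t):
    """Closed form of section 1; negative t gives r_k = phi(-k)."""
    return 1000 + 2 * (a1 * x1 ** t).real - (-0.125) ** t


r = [phi(-k) for k in range(6)]


def entrants(ages, p, years):
    """Yearly entrants needed to keep the total of `ages` constant."""
    total, out = sum(ages), []
    for _ in range(years):
        older = [ages[i] * p[i + 1] / p[i] for i in range(len(ages) - 1)]
        ages = [total - sum(older)] + older
        out.append(ages[0])
    return out


first = entrants([rk * pk for rk, pk in zip(r, p)], p, 8)
second = entrants([rk * pk for rk, pk in zip(r, p_bar)], p_bar, 8)
print([round(v, 2) for v in first])
print([round(v, 2) for v in second])
```

Both lists coincide up to rounding error, although the two totals being maintained (about 2578 and 2437) differ.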





3. Stratified population: two grades 

The results of the previous sections will now be used for an investigation of the stratified 
population.* First, we consider a population split into a lower and a higher grade in the
following way: 

We assume that all members of age 0 are in the lower grade only, but that all other ages 
may share in both grades. Apart from mortality, which operates on all members according 
to their age, we assume that at every age a certain proportion dependent on that age is 
‘promoted’, at the end of the year, from the lower into the higher grade. Our problem is 
to discover whether this can be done whilst maintaining the totals in both grades constant; 
naturally the grand total of the population must remain constant. 

It is sufficient to deal only with the lower grade, as the numbers at each age in the higher 
one can be found by subtracting those in the lower grade from the total population at that 
age. Now the lower grade is depleted by mortality and also by promotions. If the probability 
of remaining unpromoted until age i is $t_i$, then the probability of not leaving the grade in
this period is $p_i t_i = \bar p_i$, say. Since all entrants into the population are at the same time
entrants into the lower grade, our problem thus reduces to the following:

Is it possible to find an initial population, stratified into two grades, such that, on the
basis of mortality described by $p_i$, the number of entrants every year necessary to keep the
population constant is the same as that calculated on the basis of mortality-cum-promotion,
described by $\bar p_i$?

We can apply our results in § 2 to this case by considering the lower grade and the total
population as the two populations given. It follows that the lower grade can only be kept
constant by that number of entrants which is necessary for the total population, if the latter
is initially such that some of the $P_i(t)$ which depend on it are either identically zero or at
least do not extend to the highest degree indicated by the multiplicities of the corresponding
roots in $\sum p_i x^{n-i} = 0$. In order to find a suitable initial population for the lower grade it is
then necessary to find an equation $\sum \bar p_j y^{\bar n - j} = 0$ which has the roots, with the necessary
multiplicities, which appear explicitly in $\phi_t$ as calculated from the original equation, but
which is not identical with it. The degree of $\sum \bar p_j y^{\bar n - j} = 0$ may be lower than or equal to that
of $\sum p_i x^{n-i} = 0$. If it is lower, then all members of the population will be in the higher grade
at the highest age or ages.

This condition is not sufficient, however. In view of the interpretation of the equation
containing the $\bar p_i$'s these coefficients must be positive and, as the lower grade is a part of the
whole, we must have $\bar p_i \le p_i$ for all i. But it is not necessary that we have also $\bar p_{i+1} \le \bar p_i$.
If the opposite holds, this could still bear a practical interpretation. It would mean that
reversions occur from the higher into the lower grade.

If an equation with the necessary and sufficient properties can be found, then we take the
$r_i = l_i/p_i$ which we had to start with and construct the initial population of the lower grade
by writing the number at age i as $\bar l_i = r_i \bar p_i = l_i \bar p_i / p_i$.

It will be seen that in such a population the age distributions change with the passage of 
time (tending to a stationary limit) but that nevertheless all entrants have the same combined
prospects of survival and promotion. (Thus from the point of view of a member of
the community his position is the same as if he entered a stationary population. His chances 

* Cf., for the stationary case, with continuous changes, H. L. Seal (1946).






of promotion are unaffected by the changes in the age distribution of those in front of him. 
But the characteristics of the population as a whole, for instance the efficiency of the staff 
from the point of view of an employer may, of course, vary considerably.) Such a population
will be called semi-stationary. 

Example. The population shown in Table 2 can be taken as representing a lower grade
within the population given in Table 1. The ratios $t_i = p'_i/p_i$ are then:

  t_0     t_1      t_2       t_3     t_4    t_5
  1      33/35   218/245   62/75    4/5    4/5
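
The rule for forming the lower grade can be sketched numerically. The snippet below is only an illustration under stated assumptions: the ratios are those quoted in the example, while the whole-population figures `l` are invented for the purpose (Table 1 is not reproduced in this section), and the variable names are ours.

```python
from fractions import Fraction as F

# Ratios t_i = p'_i / p_i quoted in the example (ages 0..5).
t = [F(1), F(33, 35), F(218, 245), F(62, 75), F(4, 5), F(4, 5)]

# Hypothetical initial numbers l_i of the whole population at ages 0..5
# (illustrative only; these are not the figures of Table 1).
l = [350, 245, 150, 75, 50, 10]

# Lower grade: l'_i = l_i * p'_i / p_i = l_i * t_i.
lower = [li * ti for li, ti in zip(l, t)]

# Higher grade: whatever remains at each age.
higher = [li - lo for li, lo in zip(l, lower)]
```

Since $t_0 = 1$, the higher grade is empty at age 0, which agrees with the assumption that all members enter at the lowest age in the lowest grade.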

Table 3 is constructed by subtracting Table 2 from Table 1 and thus shows the com- 
position of the higher grade. 

Table 3

  t =       0    1    2    3    4    5    6    7   8 and after
  Age 1    73   43   50   52   49   50   50   50   50
      2    25   82   48   56   58   56   56   56   56
      3     9   12   40   23   27   28   27   27   27
      4    23    2    3   10    6    7    7    7    7
      5    11    2    -    -    1    1    1    1    1
  Total   141  141  141  141  141  141  141  141  141



4. Stratified population: more than two grades 

Let us now split up the higher grade as well. We have then, say, k grades, with grades 2 and 
above forming the aggregate which was simply called the higher grade in §2; grade 1 is 
identical with the lower grade of that section. 

We assume further that promotions from any grade into the next higher one take place 
at the end of every year and that every promotee into any grade has to stay there for at least 
one year. Thus in any population the lowest possible age of grade $g$ is $g - 1$. The actual lowest
ages may be different, because the first promotion rates different from 0 may concern higher 
ages than these. The rates of promotion can be different from grade to grade, but depend 
within each grade only on the age, as before. 

We shall again investigate whether it is possible to keep the total numbers of every grade 
constant, even if the age distributions of the grades are changing. 

We have seen that the age distribution of the total population, after $t$ years, is

$$\phi_t,\ \phi_{t-1}p_1,\ \dots,\ \phi_{t-n}p_n.$$

The distribution of grade 1 is, at the same time,

$$\phi_t,\ \phi_{t-1}p'_1,\ \dots,\ \phi_{t-n}p'_n,$$

and it is assumed that the set of $p'_i$ is not identical with the set of $p_i$. Hence grades 2 and above
will have the age distribution

$$\phi_t(p_0 - p'_0),\ \phi_{t-1}(p_1 - p'_1),\ \dots,\ \phi_{t-n}(p_n - p'_n).$$

Let us assume that $\phi_{t-v}(p_v - p'_v)$ is the first item in this series which is not zero. Clearly we
have $v \ge 1$. Then, as far as numbers of members (and not their individual careers) are concerned,
this aggregate of grades 2 and above is equivalent to a population which has arisen
from successive annual entrants $\phi_{t-v}(p_v - p'_v)$ who have been subject to rates of survival

$$\frac{p_{v+1} - p'_{v+1}}{p_v - p'_v},\ \dots,\ \frac{p_n - p'_n}{p_v - p'_v}.$$

S. Vajda


It must be understood, however, that 'survival' is here a balance between deaths and
promotions into the grade, so that these rates may very well exceed unity.

The number of annual entrants into grade 2 is given by 


fa-viPv-V,) = + £/<(< - V) z'r] (Pr - V.), 

where $x_1, \dots, x_m$ are the common roots of $\sum_{i=0}^{n} p_i x^{n-i} = 0$ and $\sum_{i=0}^{n} p'_i x^{n-i} = 0$, with multiplicities
$k_i$ and $j_i$ respectively, and where the $P_i$ are polynomials whose order does not exceed either
$k_i - 1$ or $j_i - 1$. (They may all be identically zero.)

The $x_i$ are, of course, also roots of $\sum_{i=v}^{n} (p_i - p'_i)x^{n-i} = 0$, with multiplicities given by the
smaller of $k_i$ and $j_i$.

We ask now if it is possible to construct grade 2 alone in such a way that its total also
remains constant. The argument which has been used in § 3 shows that this is possible if another
equation of degree $n - v$ can be found whose coefficients $u_i$, say, are not larger than the
corresponding coefficients of the equation for grades 2 and above (and $u_0 = 1$), which has once
again the roots $x_1, \dots, x_m$, with multiplicities $g_i$ at least. If $g_1 + \dots + g_m = n - v$, then this is
clearly impossible. If $g_1 + \dots + g_m$ is smaller than this value, then we can try to find such an
equation. The initial population can then also be found, if we multiply the initial population of
grades 2 and above by the ratios of the new coefficients to the old ones. The grades 1, 2
and the aggregate of 3 and above can then be constructed and every stratum kept constant,
but with changing age distributions.

We can proceed in the same way and find at each step whether further splitting up is
possible beyond 3 grades, 4 grades, etc. It is seen that in general, if $\sum g_i = n - m$, and if
grade $g$ starts in fact at age $g - 1$, then $m + 1$ grades can exist.

The smallest value of $\sum g_i$ is 1, and in this extreme case $n$ grades can be constructed, i.e.
one less than the number of ages. The $n$th grade will then contain the ages $n - 1$ and $n$.
Further, since $x_1$ is a root of $x - x_1 = 0$, the age distribution of this highest grade is


($x_1$ is, of course, negative).

Example. We use again the same example as before. The characteristic equation for the 
whole population was 

x b 4- fr 4 4- 4- ^ x 2 4- -^$x 4- = 0, 

and that for grade 1 alone 

x^f^ + l^ + ^x^^x + j^ = 0. 

The difference between these two equations gives the equation for grade 2 and above

$$x^4 + \tfrac{9}{8}x^3 + \tfrac{13}{24}x^2 + \tfrac{13}{96}x + \tfrac{1}{96} = 0.$$

This equation has, of course, the roots $-\tfrac14 \pm i\sqrt{5/48}$ and $-\tfrac18$ which are common to the two
characteristic equations of the fifth degree, and also a further root $-\tfrac12$. Now there is
a biquadratic equation with the three specified common roots and not larger coefficients
(and having the coefficient of $x^4$ equal to unity), viz.

$$x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$$

The fourth, irrelevant, root is $-1/5$. This equation leads to the following development:

Grade 2 only

  t =       0    1    2    3    4    5    6    7   8 and after
  Age 1    73   43   50   52   49   50   50   50   50
      2    18   60   35   41   43   40   41   41   41
      3     6    8   26   14   18   19   18   18   18
      4    11    -    1    5    2    3    3    3    3
      5     4    1    -    -    -    -    -    -    -
  Total   112  112  112  112  112  112  112  112  112






Grade 3 and above

  t =       0    1    2    3    4    5    6    7   8 and after
  Age 2     7   22   13   15   15   16   15   15   15
      3     3    4   14    9    9    9    9    9    9
      4    12    2    2    5    4    4    4    4    4
      5     7    1    -    -    1    -    1    1    1
  Total    29   29   29   29   29   29   29   29   29



Analysis into further grades is impossible in this case, because the characteristic equation
of the third grade does not have any roots apart from the three common roots of all previous
equations.
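
The arithmetic of this example can be checked with exact fractions. The sketch below assumes that the three common roots are $-1/8$ and $-1/4 \pm i\sqrt{5/48}$, i.e. that the common cubic factor is $x^3 + \tfrac58 x^2 + \tfrac{11}{48}x + \tfrac1{48}$ (this factor is inferred from the four-figure coefficients quoted in § 5, it is not printed legibly here); the helper `polymul` is ours.

```python
from fractions import Fraction as F

def polymul(a, b):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Assumed common cubic factor (roots -1/8 and -1/4 +/- i*sqrt(5/48)).
cubic = [F(1), F(5, 8), F(11, 48), F(1, 48)]

# Grades 2 and above: the further root is -1/2.
grade2_and_above = polymul(cubic, [F(1), F(1, 2)])

# Grade 2 alone: the fourth, irrelevant, root is -1/5.
grade2_only = polymul(cubic, [F(1), F(1, 5)])

# The coefficients for grade 2 alone must not exceed those for grades 2 and above.
not_larger = all(a <= b for a, b in zip(grade2_only, grade2_and_above))
```

Under these assumptions every coefficient of the second biquadratic is no larger than the corresponding coefficient of the first, which is the condition required for grade 2 to be a part of the aggregate of grades 2 and above.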

5. Promotion rates dependent on seniority
We still consider more than two grades, but now we will assume that the promotion rates do
not depend on the attained age but on the seniority, i.e. on the time spent in the grade,
instead. In the lowest grade seniority is equivalent to age, because all members were supposed
to enter at the lowest age only. If we consider again the two grades of § 3, but this time
take note of differences in seniority, we find the following pattern:


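Numbers at time $t$, by age, grade and (in the higher grade) seniority:

Age 0.  Lower grade: $\phi_t$.  Total: $\phi_t$.
Age 1.  Lower grade: $\phi_{t-1}p_1 t_1$.  Higher grade, seniority 0: $\phi_{t-1}p_1(t_0 - t_1)$.  Total: $\phi_{t-1}p_1$.
Age 2.  Lower grade: $\phi_{t-2}p_2 t_2$.  Higher grade, seniority 0: $\phi_{t-2}p_2(t_1 - t_2)$; seniority 1: $\phi_{t-2}p_2(t_0 - t_1)$.  Total: $\phi_{t-2}p_2$.
...
Age $x$.  Lower grade: $\phi_{t-x}p_x t_x$.  Higher grade, seniority 0: $\phi_{t-x}p_x(t_{x-1} - t_x)$; ...; seniority $x - 1$: $\phi_{t-x}p_x(t_0 - t_1)$.  Total: $\phi_{t-x}p_x$.

Note. $t_i = p'_i/p_i$ and hence $t_0 = 1$.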





If we consider now promotion from grade 2 into grade 3, and if we introduce $u_s$, the probability
of not being promoted during $s$ years from grade 2 ($u_0 = 1$), we see that grades 2
and 3 (including higher grades, if any) will have the following constitution:


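Grade 2

Age 1.  Seniority 0: $\phi_{t-1}p_1(t_0 - t_1)u_0$.
Age 2.  Seniority 0: $\phi_{t-2}p_2(t_1 - t_2)u_0$; seniority 1: $\phi_{t-2}p_2(t_0 - t_1)u_1$.
...
Age $x$.  Seniority 0: $\phi_{t-x}p_x(t_{x-1} - t_x)u_0$; ...; seniority $x - 1$: $\phi_{t-x}p_x(t_0 - t_1)u_{x-1}$.

Grade 3

Age 2.  Seniority 0: $\phi_{t-2}p_2(t_0 - t_1)(u_0 - u_1)$.
Age 3.  Seniority 0: $\phi_{t-3}p_3[(t_1 - t_2)(u_0 - u_1) + (t_0 - t_1)(u_1 - u_2)]$; seniority 1: $\phi_{t-3}p_3(t_0 - t_1)(u_0 - u_1)$.
...
Age $x$.  Seniority 0: $\phi_{t-x}p_x[(t_{x-2} - t_{x-1})(u_0 - u_1) + (t_{x-3} - t_{x-2})(u_1 - u_2) + \dots + (t_0 - t_1)(u_{x-2} - u_{x-1})]$;
        seniority 1: $\phi_{t-x}p_x[(t_{x-3} - t_{x-2})(u_0 - u_1) + \dots + (t_0 - t_1)(u_{x-3} - u_{x-2})]$;
        ...;
        seniority $x - 2$: $\phi_{t-x}p_x(t_0 - t_1)(u_0 - u_1)$.
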

It follows by means of the same argument as before that grade 2 can be kept constant if
we can find the $u_i$ such that the equation

$$x^{n-1}p_1(t_0 - t_1) + x^{n-2}p_2[(t_1 - t_2) + (t_0 - t_1)u_1] + \dots + p_n[(t_{n-1} - t_n) + (t_{n-2} - t_{n-1})u_1 + \dots + (t_0 - t_1)u_{n-1}] = 0$$

has the same roots which were common to $x^n + x^{n-1}p_1 + \dots + p_n = 0$ and
$x^{n-1}(p_1 - p'_1) + \dots + (p_n - p'_n) = 0$,
which is identical with the difference of the first two equations of degree $n$, referring respectively
to the whole population and to the lowest grade. We must further insist that all $u_i$
must have non-negative values, not larger than 1. The coefficients of the powers of $x$ must
also be positive, but it is not necessary that $u_{i+1} \le u_i$, unless we do not admit reversions.
If $m$ is the number of common roots, then it follows again as in the last section that $n - m + 1$
grades could exist which remain constant under the operation of promotions, but that their
age and seniority distributions change.

Example. Dealing once more with the same example as in the previous sections, we have
to find a biquadratic equation

$$\dots + \tfrac{5}{96}\bigl[(\tfrac45 - \tfrac45) + (\tfrac{62}{75} - \tfrac45)u_1 + (\tfrac{218}{245} - \tfrac{62}{75})u_2 + (\tfrac{33}{35} - \tfrac{218}{245})u_3 + (1 - \tfrac{33}{35})u_4\bigr] = 0,$$

or, if we use four significant figures in every fraction,

$$x^4 + (0.5417 + 0.5833u_1)x^3 + (0.1973 + 0.1658u_1 + 0.1786u_2)x^2$$
$$+ (0.01805 + 0.04274u_1 + 0.03593u_2 + 0.03869u_3)x$$
$$+ (0 + 0.001389u_1 + 0.003288u_2 + 0.002764u_3 + 0.002976u_4) = 0.$$

This biquadratic equation must have the roots $-\tfrac14 \pm i\sqrt{5/48}$ and $-\tfrac18$. If the fourth root is
called $(-z)$, then the equation must be identical with

$$(x^3 + \tfrac58 x^2 + \tfrac{11}{48}x + \tfrac{1}{48})(x + z) = 0.$$

Simple arithmetic shows then that

$$u_1 = 0.1429 + 1.7142z, \quad u_2 = 0.0459 + 1.9082z,$$
$$u_3 = -0.1283 + 2.2566z \quad\text{and}\quad u_4 = 0.0019 + 1.9962z.$$

Now $z$ must be at least 0.05685 to make $u_3$ positive and it must not exceed 0.5, because
otherwise the $u_i$ would exceed unity. But then $u_4$ will always be larger than $u_3$, unless we put
$z = \tfrac12$, which would mean $u_i = 1$ for all $i$ and then there would be no members at all in grades 3
and above. It follows that we must admit reversions from grade 3 into grade 2. We can then,
for instance, take $z = 0.2$ and have

$$u_1 = 0.4857, \quad u_2 = 0.4275, \quad u_3 = 0.3230 \quad\text{and finally}\quad u_4 = 0.4011.$$

The biquadratic equation becomes

$$x^4 + \tfrac{33}{40}x^3 + \tfrac{17}{48}x^2 + \tfrac{1}{15}x + \tfrac{1}{240} = 0.$$
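
The choice of $z$ can be explored numerically. The following is a minimal sketch, assuming the linear expressions for the $u_i$ read $u_1 = 0.1429 + 1.7142z$, $u_2 = 0.0459 + 1.9082z$, $u_3 = -0.1283 + 2.2566z$ and $u_4 = 0.0019 + 1.9962z$ (some digits are unclear in the print and are chosen here so that $z = 0.5$ gives $u_i = 1$ and $z = 0.2$ gives the quoted values); the names `coeffs` and `u_values` are ours.

```python
# Intercept and slope of u_i = a_i + b_i * z (assumed readings of the example).
coeffs = {1: (0.1429, 1.7142), 2: (0.0459, 1.9082),
          3: (-0.1283, 2.2566), 4: (0.0019, 1.9962)}

def u_values(z):
    """Return [u_1, u_2, u_3, u_4], rounded to four decimals, for fourth root -z."""
    return [round(a + b * z, 4) for a, b in (coeffs[i] for i in (1, 2, 3, 4))]

u = u_values(0.2)

# A reversion is needed whenever some u_{i+1} exceeds u_i.
has_reversion = any(later > earlier for earlier, later in zip(u, u[1:]))
```

With $z = 0.2$ the sequence rises from $u_3$ to $u_4$, so the flag is set, in agreement with the remark that reversions from grade 3 into grade 2 must be admitted.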

This is the same as the one used in § 4, and we can again write down the changing pattern of 
the population, but this time taking also seniority into account: 


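Grade 2 (numbers by age and seniority; columns 0, 1, 2, ... are years of seniority)

           t = 0                         t = 1                    t = 2                    t = 3
Age      0   1   2   3   4  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot
1       73   -   -   -   -   73  |   43   -   -   -   43  |   50   -   -   -   50  |   52   -   -   -   52
2       12   6   -   -   -   18  |   40  20   -   -   60  |   23  12   -   -   35  |   27  14   -   -   41
3        3   2   1   -   -    6  |    4   2   2   -    8  |   15   6   5   -   26  |    8   3   3   -   14
4        3   3   3   2   -   11  |    -   -   -   -    -  |    -   1   -   -    1  |    1   2   1   1    5
5        -   -   2   1   1    4  |    -   -   1   -    1  |    -   -   -   -    -  |    -   -   -   -    -
Total   91  11   6   3   1  112  |   87  22   3   -  112  |   88  19   5   -  112  |   88  19   4   1  112

           t = 4                         t = 5                    t = 6                    t = 7 and later
Age      0   1   2   3   4  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot
1       49   -   -   -   -   49  |   50   -   -   -   50  |   50   -   -   -   50  |   50   -   -   -   50
2       28  15   -   -   -   43  |   27  13   -   -   40  |   27  14   -   -   41  |   27  14   -   -   41
3       10   4   4   -   -   18  |   10   5   4   -   19  |   10   4   4   -   18  |   10   4   4   -   18
4        1   1   -   -   -    2  |    1   1   1   -    3  |    1   1   1   -    3  |    1   1   1   -    3
5        -   -   -   -   -    -  |    -   -   -   -    -  |    -   -   -   -    -  |    -   -   -   -    -
Total   88  20   4   -   -  112  |   88  19   5   -  112  |   88  19   5   -  112  |   88  19   5   -  112
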






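Grade 3 (numbers by age and seniority; columns 0, 1, 2, ... are years of seniority)

           t = 0                         t = 1                    t = 2                    t = 3
Age      0   1   2   3   4  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot
2        7   -   -   -   -    7  |   22   -   -   -   22  |   13   -   -   -   13  |   15   -   -   -   15
3        1   2   -   -   -    3  |    2   2   -   -    4  |    6   8   -   -   14  |    4   5   -   -    9
4        4   3   5   -   -   12  |    1   -   1   -    2  |    -   1   1   -    2  |    1   2   2   -    5
5        1   2   2   2   -    7  |    -   -   -   1    1  |    -   -   -   -    -  |    -   -   -   -    -
Total   13   7   7   2   -   29  |   25   2   1   1   29  |   19   9   1   -   29  |   20   7   2   -   29

           t = 4                         t = 5                    t = 6                    t = 7 and later
Age      0   1   2   3   4  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot  |    0   1   2   3  Tot
2       15   -   -   -   -   15  |   16   -   -   -   16  |   15   -   -   -   15  |   15   -   -   -   15
3        4   5   -   -   -    9  |    4   5   -   -    9  |    4   5   -   -    9  |    4   5   -   -    9
4        1   1   2   -   -    4  |    1   1   2   -    4  |    1   1   2   -    4  |    1   1   2   -    4
5        -   1   -   -   -    1  |    -   -   -   -    -  |    -   1   -   -    1  |    -   1   -   -    1
Total   20   7   2   -   -   29  |   21   6   2   -   29  |   20   7   2   -   29  |   20   7   2   -   29
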


We find, as before, that further splitting up of grades is impossible, if the total in each 
grade is to remain constant throughout the years. 


Summary 

This investigation deals with a stratified population, which is subject to (i) mortality, 
dependent on age, and to (ii) promotion rates, indicating the ratios of members of a grade 
which are transferred to the next higher grade at the end of the year. 

Section 1 concerns a population which is not yet stratified and formulae are deduced to 
calculate the number of entrants at time t, necessary to replace yearly deaths and thus to 
keep the total of the population constant. This number depends clearly on the mortality 
rates and on the age distribution existing at time $t = 0$. In general the population tends
towards a limiting age distribution, the ‘intrinsic stationary population’. 

Section 2 considers two populations and conditions are derived for the case that they need, 
every year, equal numbers of entrants to keep them constant. 

Section 3 introduces the stratified population. Both mortality and promotion rates 
depend on the age, and they are independent of the time t. Under certain conditions one of 
the two populations considered in § 2 can be taken as the whole and the other as the lowest 
grade in it. It is shown how and when entries into the grade can, at the same time, replace 
both losses due to mortality in the whole population, and to mortality and promotion 
depleting the lowest grade. This can also be described by saying that the totals of both grades 
can be kept constant at the same time, although the age distributions change from year 
to year. 

Section 4 generalizes the results of the previous section for a population consisting of
$k$ grades. If the population is spread over $n$ ages, then it is shown that up to $n - 1$ grades
may be possible in the most favourable case, such that they are all kept constant, whilst
the age distributions all oscillate. Such a population is called semi-stationary.

Section 5 introduces the case which has been of actual importance in practical establishment
work: the promotion rates are made dependent on the time spent in the grade instead
of on the age.

A numerical example is attached to § 1 and is carried through all stages to illustrate the 
results which emerge gradually in the subsequent sections. 


REFERENCES 

Leslie, P. H. (1946). On the use of matrices in certain population mathematics. Biometrika, 33, 183. 
Seal, H. L. (1945). The mathematics of a population composed of k stationary strata. Biometrika, 
33, 226. 





A SIMPLE APPROACH TO CONFOUNDING AND FRACTIONAL 
REPLICATION IN FACTORIAL EXPERIMENTS 

By O. KEMPTHORNE, Rothamsted Experimental Station 
Introduction 

The design and analysis of factorial experiments was described in 1937 by Yates in considerable
detail. In his treatment Yates described first the $2^n$ system and then went on to deal
with $3^n$ experiments and experiments of the $2^m 3^n$ type. The $2^n$ system is capable of very
easy explanation, but with experiments of higher order both the design and analysis become
of increasing complexity. It is the purpose of this paper to present a general method by
which factorial designs of the type $p^n$ may be examined, in respect of both confounding and
fractional replication. The method will be described by explanation of the rules for the $2^n$
and $3^n$ systems and corresponds quite closely to that given by Fisher (1942). The present
approach presents confounding and fractional replication as different aspects of the same
process. Experimental designs suggested by Plackett & Burman (1946) are also discussed.

The $2^n$ system

In this system all combinations of $n$ factors each at two levels are tested. The totality of
treatment combinations may be represented by the points of an $n$-dimensional lattice, each
side being of unit length. Let the factors be $x_1, x_2, \dots, x_n$ and take $n$ mutually orthogonal
axes $y_1 \dots y_n$. The point $(000 \dots 0)$ will then represent the control treatment, $(1000 \dots 0)$ the
treatment consisting of $x_1$ at the upper level and all the other factors at the lower level, and
so on. The treatment effect of $x_1$ is the difference of the means of the yields of plots receiving
$x_1$ and those not receiving $x_1$. It is therefore the difference between the mean of the plots
represented by points lying on the plane $y_1 = 1$ and the mean of those represented by the
points on the plane $y_1 = 0$. The interaction of $x_1$ and $x_2$ is the difference between the means
of those plots represented by $y_1 = 1$, $y_2 = 1$ or $y_1 = 0$, $y_2 = 0$ and those represented by
$y_1 = 0$, $y_2 = 1$ and $y_1 = 1$, $y_2 = 0$, i.e. the difference of the means of those plots for which

$$y_1 + y_2 = 2 \text{ or } 0, \text{ i.e. } y_1 + y_2 = 0 \pmod 2,$$

and those for which $y_1 + y_2 = 1 \pmod 2$.

Similarly, the triple interaction of $x_1$, $x_2$ and $x_3$ is the difference between the means of those
plots for which $y_1 + y_2 + y_3 = 0 \pmod 2$,

and those for which $y_1 + y_2 + y_3 = 1 \pmod 2$.

This process can be continued to the consideration of the interaction of $x_1, x_2, \dots, x_n$, which
is the difference between the mean of those plots for which

$$y_1 + y_2 + y_3 + \dots + y_n = 0 \pmod 2,$$

and the mean of those for which

$$y_1 + y_2 + y_3 + \dots + y_n = 1 \pmod 2.$$

In the $n$-dimensional space parallel hyper-planes may be drawn containing the points of the
lattice, such that the total yield forming the positive part of an interaction is obtained from 



256 Confounding and fractional replication in factorial experiments 

a set of parallel hyper-planes equidistant from each other. Likewise the negative part is 
obtained from another set of parallel hyper-planes, each plane of which lies midway between 
two planes of the first set. 
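
The parity rule of this section lends itself to a short computational sketch. The code below is an illustration only: the yields are invented (a noiseless response $10 + 4x_1 + 2x_1x_2$ in a $2^3$ layout) and the function name `parity_means` is ours.

```python
from itertools import product

def parity_means(yields, factors):
    """Mean yield of plots with level sum 0 (mod 2) and with level sum 1 (mod 2).

    `yields` maps each treatment (a tuple of 0/1 levels) to a plot yield and
    `factors` lists the 0-based positions of the factors in the contrast.
    """
    means = {}
    for parity in (0, 1):
        vals = [v for y, v in yields.items()
                if sum(y[i] for i in factors) % 2 == parity]
        means[parity] = sum(vals) / len(vals)
    return means[0], means[1]

# Hypothetical 2^3 experiment with an x1 effect and an x1.x2 interaction only.
yields = {y: 10 + 4 * y[0] + 2 * y[0] * y[1] for y in product((0, 1), repeat=3)}

m0, m1 = parity_means(yields, [0])
main_effect_x1 = m1 - m0        # plots receiving x1 minus those not receiving it

m0, m1 = parity_means(yields, [0, 1])
interaction_x1x2 = m0 - m1      # (y1 + y2 even) minus (y1 + y2 odd)
```

The two hyper-plane sets are simply the plots selected by `parity == 0` and `parity == 1`; the sign attached to each set follows the wording of the text for the main effect and the two-factor interaction.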


The $3^n$ system

With $n$ factors at each of three levels the treatment combinations are given by an $n$-dimensional
lattice, each side being of length two units and containing three points. The treatment
contrasts may be described as in the $2^n$ system with some slight modifications.

Any contrast in the $3^n$ system involves the comparison of three totals of the yields of $3^{n-1}$
plots, and may be represented by the comparison of the differences between the yields of the
plots lying on three sets of parallel hyper-planes. For example, if $n = 2$ the lattice is as
follows:

                 y_2
             0   1   2
         0   .   .   .
   y_1   1   .   .   .
         2   .   .   .


The main effect of $x_1$ is the difference between the totals of yields of plots for which $y_1 = 0$,
$y_1 = 1$ and $y_1 = 2$. The $I$ component* of the interaction of $x_1$ and $x_2$ is the difference between
the totals of the yields of plots for which

$$y_1 - y_2 = 0, \quad y_1 - y_2 = 1, \quad\text{and}\quad y_1 - y_2 = 2.$$

The $J$ component is given by the contrast between the yields of plots for which

$$y_1 + y_2 = 0, \quad y_1 + y_2 = 1, \quad\text{and}\quad y_1 + y_2 = 2.$$

Anticipating the extension to cases when $n$ is greater than 2, the equations for the $I$
component may be written as follows:

$$X_1X_2(I_0):\ y_1 + 2y_2 = 0 \pmod 3,$$
$$X_1X_2(I_1):\ y_1 + 2y_2 = 1 \pmod 3,$$
$$X_1X_2(I_2):\ y_1 + 2y_2 = 2 \pmod 3.$$

If $x_1$ and $x_2$ (and therefore $y_1$ and $y_2$) are interchanged, then $X_2X_1(I_0)$ is given by the
equations $y_2 + 2y_1 = 0$, $X_2X_1(I_1)$ by $y_2 + 2y_1 = 1$, $X_2X_1(I_2)$ by $y_2 + 2y_1 = 2$, all mod 3. But
the equation $y_2 + 2y_1 = 0 \pmod 3$ is identical with the equation $y_1 + 2y_2 = 0 \pmod 3$, since
$3y_1 + 3y_2 = 0 \pmod 3$, whatever the values of $y_1$ and $y_2$; $X_2X_1(I_0)$ is therefore equal to
$X_1X_2(I_0)$. Subtracting the equation $y_2 + 2y_1 = 1 \pmod 3$ from the equation

$$3y_1 + 3y_2 = 0 = 3 \pmod 3,$$

we get $y_1 + 2y_2 = 2 \pmod 3$; $X_2X_1(I_1)$ is therefore identical with $X_1X_2(I_2)$. It is obvious
from the equations given above for the $J$ component that $X_1X_2(J_i) = X_2X_1(J_i)$ for $i = 0$,
1 and 2.
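
The identities between the components under interchange of the two factors can be confirmed by listing the lattice points. This is a minimal sketch; the function `classes` and the variable names are ours.

```python
from itertools import product

# The nine points (y1, y2) of the 3 x 3 lattice.
points = list(product(range(3), repeat=2))

def classes(a1, a2):
    """Partition the lattice by the value of a1*y1 + a2*y2 (mod 3)."""
    out = {0: set(), 1: set(), 2: set()}
    for y1, y2 in points:
        out[(a1 * y1 + a2 * y2) % 3].add((y1, y2))
    return out

I12 = classes(1, 2)   # X1X2(I): y1 + 2*y2 = 0, 1, 2
I21 = classes(2, 1)   # X2X1(I): y2 + 2*y1 = 0, 1, 2
J12 = classes(1, 1)   # J component: y1 + y2 = 0, 1, 2
```

Comparing the sets shows that interchanging the factors leaves the class $I_0$ unchanged and exchanges $I_1$ with $I_2$, while the $J$ classes are symmetric in $y_1$ and $y_2$ by construction.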

* Yates's terminology for the components of interactions is used where convenient, but it is more
convenient to refer to $I_1$, $I_2$ and $I_3$ as $I_0$, $I_1$ and $I_2$ respectively.





Considering the case $n = 3$, it is easily seen that the second order interaction may be
split into four parts each consisting of the contrasts between three totals. These may be
represented by the following equations:

(I)    $y_1 + y_2 + y_3 = 0, 1, 2 \pmod 3$,

(II)   $y_1 + 2y_2 + y_3 = 0, 1, 2 \pmod 3$,

(III)  $y_1 + y_2 + 2y_3 = 0, 1, 2 \pmod 3$,

(IV)   $y_1 + 2y_2 + 2y_3 = 0, 1, 2 \pmod 3$.

In order these have been named by Yates

$Z$, $X$, $Y$, $W$.

It is interesting in passing to note the relations between $Z$, $X$, $Y$ and $W$ for permutations of
the order of the factors. It is obvious from (I) that $Z$ is invariant for any change in order of
the three factors $X_1$, $X_2$ and $X_3$. Interchanging $y_2$ and $y_3$, equations (II) become equations
(III), so that $ABC(X) = ACB(Y)$. The following interchanges may be easily verified (using
the equation $3y_1 + 3y_2 + 3y_3 = 0 \pmod 3$ where necessary):

$$ABC(X) = BCA(W) = CAB(Y) = ACB(Y) = CBA(X) = BAC(W).$$
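
Relations of this kind can be checked mechanically by writing each component as a vector of coefficients on $(y_A, y_B, y_C)$ and scaling it so that the first non-zero coefficient is 1. The sketch below does this; the dictionary `COMPONENTS` and the helper names are ours, and the check is offered as an independent verification rather than as the method of the paper.

```python
# Coefficients on (first, second, third factor) for Yates's four components.
COMPONENTS = {"Z": (1, 1, 1), "X": (1, 2, 1), "Y": (1, 1, 2), "W": (1, 2, 2)}

def normalise(coeffs):
    """Scale a coefficient vector (mod 3) so its first non-zero entry is 1."""
    lead = next(c for c in coeffs if c % 3)
    inv = 1 if lead % 3 == 1 else 2          # 2 * 2 = 4 = 1 (mod 3)
    return tuple((inv * c) % 3 for c in coeffs)

def component(order, name):
    """Coefficients on (y_A, y_B, y_C) of component `name` for factor order `order`."""
    coeffs = [0, 0, 0]
    for position, letter in enumerate(order):
        coeffs["ABC".index(letter)] = COMPONENTS[name][position]
    return normalise(coeffs)

# Every (order, component) pair defining the same hyper-planes as ABC(X).
same_as_ABC_X = sorted(
    order + "(" + name + ")"
    for order in ("ABC", "ACB", "BAC", "BCA", "CAB", "CBA")
    for name in COMPONENTS
    if component(order, name) == component("ABC", "X")
)
```

For each of the six orders exactly one component coincides with $ABC(X)$; for the order $BCA$ the computation gives the $W$ component.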


From the equations, it is clear that $Z$, $X$, $Y$ and $W$ may be computed in the way given by
Yates, since

$$Z = J\{x_1, J(x_2, x_3)\}, \qquad Y = J\{x_1, I(x_2, x_3)\},$$
$$X = I\{x_1, I(x_2, x_3)\}, \qquad W = I\{x_1, J(x_2, x_3)\},$$

$I(x_2, x_3)$ and $J(x_2, x_3)$ being evaluated for each level of $x_1$. The extension to the case $n = 4$ is
again obvious; the main effects, two-factor and three-factor interactions, follow as in the
above, and the four-factor interaction may be split into eight comparisons of three totals:

I      $y_1 + y_2 + y_3 + y_4 = 0, 1, 2 \pmod 3$,
II     $y_1 + y_2 + y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
III    $y_1 + y_2 + 2y_3 + y_4 = 0, 1, 2 \pmod 3$,
IV     $y_1 + y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
V      $y_1 + 2y_2 + y_3 + y_4 = 0, 1, 2 \pmod 3$,
VI     $y_1 + 2y_2 + y_3 + 2y_4 = 0, 1, 2 \pmod 3$,
VII    $y_1 + 2y_2 + 2y_3 + y_4 = 0, 1, 2 \pmod 3$,
VIII   $y_1 + 2y_2 + 2y_3 + 2y_4 = 0, 1, 2 \pmod 3$.




As in the case of two factors, the effect of permutations of the order on the components of
$X$, $Y$, $Z$ and $W$ may be easily obtained. The four-factor interactions may be computed by
putting the equations given above into the following form:

I   = $J\{x_1, Z\}$,        V    = $I\{x_1, W\}$,
II  = $J\{x_1, Y\}$,        VI   = $I\{x_1, X\}$,
III = $J\{x_1, X\}$,        VII  = $I\{x_1, Y\}$,
IV  = $J\{x_1, W\}$,        VIII = $I\{x_1, Z\}$,

where the three components of $W$, $X$, $Y$, $Z$ of $x_2$, $x_3$, $x_4$ (in that order) are evaluated for
each level of $x_1$.


The $p^n$ system

The total of $p^n - 1$ degrees of freedom, where $p$ is a prime, in the analysis of variance of
a $p^n$ experiment may be split into $(p^n - 1)/(p - 1)$ sets of $(p - 1)$ degrees of freedom, the
contrasts being given by the following hyper-planes:

Main effects (mod $p$):

$$y_1 = 0, 1, 2, \dots, p - 1,$$
$$y_2 = 0, 1, 2, \dots, p - 1,$$
$$\dots$$
$$y_n = 0, 1, 2, \dots, p - 1.$$

Interactions of pairs of factors, e.g. of $x_1$ and $x_2$ (mod $p$):

$$y_1 + y_2 = 0, 1, 2, \dots, p - 1,$$
$$y_1 + 2y_2 = 0, 1, 2, \dots, p - 1,$$
$$\dots$$
$$y_1 + (p - 1)y_2 = 0, 1, 2, \dots, p - 1,$$

and so on to the interaction between all the factors which is given by the hyper-planes

$$a_1y_1 + a_2y_2 + a_3y_3 + \dots + a_ny_n = 0, 1, 2, \dots, p - 1 \pmod p,$$

where $a_1$ equals 1 and $a_2, a_3, \dots, a_n$ each may take all values from 1 to $p - 1$.


Simplification of notation 

The $p^n - 1$ degrees of freedom in the $p^n$ system may be split into $(p^n - 1)/(p - 1)$ sets of
$(p - 1)$ degrees of freedom, given by the above hyper-planes, but it is only necessary to
specify one hyper-plane of each set of the parallel hyper-planes.

All the comparisons may be denoted by $y_1^{\alpha_1} y_2^{\alpha_2} \dots y_n^{\alpha_n}$, the symbol meaning that the
comparisons are given by the hyper-planes

$$\alpha_1y_1 + \alpha_2y_2 + \dots + \alpha_ny_n = 0, 1, 2, \dots, p - 1 \pmod p.$$

In order to obtain an enumeration which covers all the possibilities once and once only, it is
necessary to use the rule that the factors are always written down in ascending order,
i.e. $y_i^{\alpha_i} y_j^{\alpha_j} y_k^{\alpha_k}$, etc., such that $i < j < k \dots$ and that $\alpha_i = 1$.
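
The enumeration rule can be sketched as a short program: list every exponent vector modulo $p$ and keep those whose first non-zero exponent is 1. The function name `effect_symbols` is ours.

```python
from itertools import product

def effect_symbols(p, n):
    """All exponent vectors (mod p) of length n whose first non-zero entry is 1."""
    out = []
    for alpha in product(range(p), repeat=n):
        nonzero = [a for a in alpha if a]
        if nonzero and nonzero[0] == 1:
            out.append(alpha)
    return out

# The 3^3 system: each vector stands for one set of 2 degrees of freedom.
symbols_3_3 = effect_symbols(3, 3)
```

The length of the list should equal $(p^n - 1)/(p - 1)$, since each of the $p^n - 1$ non-zero vectors has exactly $p - 1$ non-zero multiples and only one of them begins with 1.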





The $3^n$ system in the revised notation

As an example, the $3^3$ system will be examined in detail. The effects are represented by
$y_1$, $y_2$, $y_3$; interactions between pairs, $y_1y_2$, $y_1y_2^2$, $y_1y_3$, $y_1y_3^2$, $y_2y_3$, $y_2y_3^2$; interactions between
all three factors, $y_1y_2y_3$, $y_1y_2y_3^2$, $y_1y_2^2y_3$, $y_1y_2^2y_3^2$. Any other combination of powers of the $y$'s
can be reduced to the above set.

It is interesting to examine the interactions of the effects and interactions. In the case of
the $2^n$ system, Yates refers to the generalized interaction of two interactions $ABCD$ and
$CDE$ say, which is $ABE$. The interaction of effects or interactions $A$ and $B$ consists of $AB$
and $AB^2$ in the $3^n$ system.

(a) The interactions of main effects are obviously interactions between pairs of factors.

(b) The interactions of main effects and two-factor interactions with one letter in common
are two-factor interactions and main effects: e.g. the interactions of $y_1$ and $y_1y_2$ are

$$y_1^2y_2 = y_1y_2^2 \quad\text{and}\quad y_1^3y_2^2 = y_2,$$

and the interactions of $y_1$ and $y_1y_2^2$ are $y_1^2y_2^2 = y_1y_2$, and $y_1^3y_2^4 = y_2$.
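
The reduction used in (b) amounts to adding exponent vectors modulo 3 and then rescaling so that the first non-zero exponent is 1. A minimal sketch for a general prime $p$ follows; the function names are ours, and the modular inverse relies on the three-argument `pow` available in Python 3.8 and later.

```python
def normalise(alpha, p=3):
    """Scale an exponent vector (mod p) so that its first non-zero entry is 1."""
    lead = next(a for a in alpha if a % p)
    inv = pow(lead, -1, p)                  # modular inverse of the leading exponent
    return tuple((inv * a) % p for a in alpha)

def generalised_interaction(a, b, p=3):
    """Return the symbols A*B, A*B^2, ..., A*B^(p-1) in normalised form."""
    out = []
    for k in range(1, p):
        prod = tuple((x + k * y) % p for x, y in zip(a, b))
        if any(prod):                       # skip the identity, should it arise
            out.append(normalise(prod, p))
    return out

# Exponent vectors on (y1, y2, y3).
y1, y1y2, y1y2sq = (1, 0, 0), (1, 1, 0), (1, 2, 0)
```

For instance `generalised_interaction(y1, y1y2)` returns the vectors for $y_1y_2^2$ and $y_2$, as in the text.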

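(c) The interactions between main effects and three-factor interactions are two-factor and
three-factor interactions:

Between                          Interactions

$y_1$ and $y_1y_2y_3$            $y_1^2y_2y_3 = y_1y_2^2y_3^2$,       $y_1^3y_2^2y_3^2 = y_2y_3$
$y_1$ and $y_1y_2y_3^2$          $y_1^2y_2y_3^2 = y_1y_2^2y_3$,       $y_1^3y_2^2y_3^4 = y_2y_3^2$
$y_1$ and $y_1y_2^2y_3$          $y_1^2y_2^2y_3 = y_1y_2y_3^2$,       $y_1^3y_2^4y_3^2 = y_2y_3^2$
$y_1$ and $y_1y_2^2y_3^2$        $y_1^2y_2^2y_3^2 = y_1y_2y_3$,       $y_1^3y_2^4y_3^4 = y_2y_3$
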

(d) The interactions between two-factor interactions are exemplified in the following table:


Between 

Interactions 

y x y t and y x yl 

a* 

ii 

,v?.v5 = </ 2 

y x y 2 and y,y. 

yi^y»> 

invlyl - yiyl 

y x y t and y t yj ] 

i 

ViiAy»< 

y\y\y\ = Viy» 


(e) The interactions between two-factor and three-factor interactions are exemplified in
the following table:


Between 

and 

ViV* 

y 

ViViy* 

yiVtvl y» 

Vi y\ 

y t y» 

y\y%y\ 

yiVtVs Vi 

ViV* 

y%Vz 

yiy\y» 

yivi VtVi 

vivlyi 



yiy» ytvl 

y 

y a 




The interactions between two-factor and three-factor interactions are therefore two-factor
interactions in some cases and main effects and three-factor interactions in the other cases.
(f) The interactions between three-factor interactions are set out in the following diagram:


Between 

2 /i J/a 2 /a 

V\ y%y* 

V iVaVa 

y iVaVa 

and 





ViViVa 

ViVtlA 

yiylv* 

i 

2 /i 2 / a » 2/3 

viVi’ y» 

Vi< yttfa 

1 

Vi> ViVa 

Viyl y» 
i y» 


Confounding 

Confounding or the allocation of treatment combinations to blocks implies the allocation of
all the points of the lattice into $p^c$ sets, of $p^{n-c}$ points, such that the comparisons between
these sets involve particular sets of $p - 1$ degrees of freedom. The aim of confounding is to
reduce the effect of soil heterogeneity by reducing block size, but ensuring that the block
comparisons have little possible practical importance.

If comparisons $A = y_1^{\alpha_1} y_2^{\alpha_2} \dots y_n^{\alpha_n}$ and $B = y_1^{\beta_1} y_2^{\beta_2} \dots y_n^{\beta_n}$ are confounded, then so is their
generalized interaction, i.e. all the products of these two, i.e. $AB, AB^2, \dots, AB^{p-1}$. For, if
the treatment combinations for which $\alpha_1y_1 + \alpha_2y_2 + \dots + \alpha_ny_n$ is equal to $0, 1, 2, \dots, p - 1$
are put into separate blocks and also those treatments for which $\beta_1y_1 + \beta_2y_2 + \dots + \beta_ny_n$ is
equal to $0, 1, 2, \dots, p - 1$, then $(\alpha_1 + \lambda\beta_1)y_1 + (\alpha_2 + \lambda\beta_2)y_2 + \dots + (\alpha_n + \lambda\beta_n)y_n$ is equal (mod $p$)
to $0, 1, 2, \dots, p - 1$ for all $\lambda$ from 0 to $p - 1$.

The present approach to confounding of the $2^n$ system is identical with that given by
Yates and we proceed to consider the rather more complex case of the $3^n$ system.

(a) $3^3$ system

(1) In blocks of $3^2$. Any three-factor interaction may be confounded.

(2) In blocks of 3. We cannot confine the confounded degrees of freedom to three-factor
interactions because the generalized interaction of any two reduces to a two-factor interaction
and a main effect. If two three-factor interactions, $y_1y_2y_3$ and $y_1y_2^2y_3$, are confounded,
the 8 degrees of freedom for blocks may be described as follows:

                      D.F.
  $y_2$                2
  $y_1y_3$             2
  $y_1y_2y_3$          2
  $y_1y_2^2y_3$        2
  Total                8

We can, however, choose three two-factor interactions and one three-factor interaction pair
for our block comparisons.
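
The block structure of case (2) can be generated directly from the two defining hyper-plane systems. The sketch below forms the nine blocks of three plots and then searches for every symbol that is constant within blocks; the names `A`, `B`, `value` and `constant_within_blocks` are ours.

```python
from itertools import product

A = (1, 1, 1)   # exponents of y1 y2 y3
B = (1, 2, 1)   # exponents of y1 y2^2 y3

def value(alpha, y):
    """Value (mod 3) of the linear form with coefficients alpha at the point y."""
    return sum(a * v for a, v in zip(alpha, y)) % 3

# A block is identified by the pair of values taken by the two confounded forms.
blocks = {}
for y in product(range(3), repeat=3):
    blocks.setdefault((value(A, y), value(B, y)), []).append(y)

def constant_within_blocks(alpha):
    """True if the form alpha takes a single value inside every block."""
    return all(len({value(alpha, y) for y in plots}) == 1
               for plots in blocks.values())

# Normalised symbols (first non-zero exponent 1) that are confounded with blocks.
confounded = [alpha for alpha in product(range(3), repeat=3)
              if any(alpha) and next(a for a in alpha if a) == 1
              and constant_within_blocks(alpha)]
```

The search returns four symbols, each carrying 2 degrees of freedom, corresponding to $y_2$, $y_1y_3$, $y_1y_2y_3$ and $y_1y_2^2y_3$.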





(b) $3^4$ system in blocks of $3^2$

It is immediately obvious that we can confound two two-factor interactions and two
higher-order interaction pairs to give blocks of nine. The important point, however, is to find
a design confounding only three-factor interaction pairs.

We therefore evaluate the interactions of all pairs of three-factor interactions, which have
two letters in common. These may be derived from the interaction of $y_1y_2y_3$ with the four
three-factor interactions of $y_1$, $y_2$ and $y_4$, which are as follows:

Interaction of $y_1y_2y_3$ and $y_1y_2y_4$:          $y_1y_2y_3^2y_4^2$ and $y_3y_4^2$,
Interaction of $y_1y_2y_3$ and $y_1y_2y_4^2$:        $y_1y_2y_3^2y_4$ and $y_3y_4$,
Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4$:        $y_1y_3^2y_4^2$ and $y_2y_3^2y_4$,
Interaction of $y_1y_2y_3$ and $y_1y_2^2y_4^2$:      $y_1y_3^2y_4$ and $y_2y_3^2y_4^2$.

Obviously there are many designs for the $3^4$ design in nine blocks of nine plots confounding
three-factor interactions. Those which confound four-factor interactions must also confound
two-factor interactions. The names of the confounded interactions and their squares (each
of which corresponds to the same grouping as the element itself) form a group with the
identity and the equation $y_i^3 = 1$, for all $i$, and further work is presumably most promising
on these lines.
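
A search for such designs is easily mechanised. The sketch below lists every pair of three-factor symbols of the $3^4$ system whose generalized interactions are again three-factor symbols; it is an illustration of the idea under our own naming (`weight`, `generated`, `designs`), not a reproduction of any procedure in the paper.

```python
from itertools import product, combinations

def weight(alpha):
    """Number of factors (non-zero exponents) appearing in a symbol."""
    return sum(1 for a in alpha if a)

# All normalised three-factor symbols of the 3^4 system.
three_factor = [a for a in product(range(3), repeat=4)
                if weight(a) == 3 and next(x for x in a if x) == 1]

def generated(a, b):
    """A, B and their generalized interactions AB, AB^2 (not normalised)."""
    return [a, b,
            tuple((x + y) % 3 for x, y in zip(a, b)),
            tuple((x + 2 * y) % 3 for x, y in zip(a, b))]

# Pairs of generators for which all four confounded symbols involve three factors.
designs = [(a, b) for a, b in combinations(three_factor, 2)
           if all(weight(g) == 3 for g in generated(a, b))]
```

For example the pair $y_1y_2y_3$, $y_1y_2^2y_4$ qualifies, its generalized interactions being $y_1y_3^2y_4^2$ and $y_2y_3^2y_4$, whereas two symbols in the same three letters never do.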


(c) $3^5$ in blocks of 9

There is no design confounding only three-factor or higher-order interactions. If one two- 
factor interaction can be sacrificed, a possible scheme of confounding is given by the 
following table of generalized interactions : 


Between j 


and 


i 

i 


U\!hlh 


V\ 

ihytyUil 


ih y'2 //5 


Ihlfilhih 

y 1 y\ yt y 4 y‘t 


lh yllh 


ihiiiih 




!h!h 

y^y^A 


This two-factor interaction is estimated by the comparison of three sets of nine blocks, 
and the accuracy of the estimate will be low. 


(d) $3^6$ in blocks of 27

We may, for example, confound the following: 


2^1 2/22/3 

ViyiVt 

yiylylyiyi 

y*y*y\y'l 


*/* 2 / 4 .v« 

Vivbihyiy* y^AA 
viAy*A 

yi'Ay'iy* yiy%Ay*ylA 
y*y*y*yl y^JiAyl 


Three three-factor interactions, six four-factor interactions, three five-factor interactions
and one six-factor interaction are confounded. If $y_6$ is omitted from all the above expressions
we obtain a $3^5$ experiment in blocks of nine confounding one two-factor interaction, seven
three-factor interactions, three four-factor and two five-factor interactions, that is, the
design given above for the $3^5$ system.

Extension to more complicated cases 

Extensions of the above to more complicated cases should most easily be achieved by the use
of group theory. The confounding of a $p^n$ design in $p^c$ blocks corresponds to a group of
$\tfrac12(p^c + 1)$ elements such that all except the unit element involve at least a certain number of
letters. For most agricultural experiments each element should contain at least three letters,
so that no main effects or two-factor interactions are confounded. The group is an Abelian
group and if $A$ and $B$ are elements of the group so are $AB, AB^2, \dots, AB^{p-1}$. The order of each
element is $p$, and if $A$ is an element so are the first $(p - 1)$ powers of $A$. This aspect is being
followed, and it is hoped will yield results.

Fractional replication in the $2^n$ system

Some principles of fractional replication have been worked out over the past few years at
Rothamsted (Finney, 1945). In the case of a $2^n$ system, with factors $a_1 \dots a_n$ say, a half-replicate
might consist of those treatment combinations which form the positive part of the
interaction $A_1A_2 \dots A_n$. Each function of the plot yields consisting of the sum of one-half
of them minus the sum of the other half then corresponds to two degrees of freedom.
Alternatively, each degree of freedom has one alias, and the aim in fractional replication is
to design the experiment so that the aliases of effects which the experimenter wishes to
measure are high-order interactions which could not possibly have practical significance.

For convenience of presentation, we develop first the theory for the case of the $2^n$ system.
Suppose that of all the points on the lattice for the $2^n$ system, only those points for which

$$y_1 + y_2 + y_3 + \dots + y_n = 0 \pmod 2$$

are included in the experiment. Then the points on the hyper-plane $y_1 = 0$ also lie on the
plane $y_2 + y_3 + \dots + y_n = 0$, and likewise those for which $y_1 = 1$ lie on the plane

$$y_2 + y_3 + \dots + y_n = 1.$$

The contrast which we have denoted by $y_1$ is therefore identical with that denoted by
$y_2 y_3 y_4 \dots y_n$. Again, if we suppose that only those treatment combinations are tested which
lie on the hyper-planes

$$\alpha_1 y_1 + \alpha_2 y_2 + \dots + \alpha_n y_n = 0, \qquad \beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 0 \pmod 2,$$

then the points will also lie on the intersection of these planes which is given by the equation

$$(\alpha_1 + \beta_1) y_1 + (\alpha_2 + \beta_2) y_2 + \dots + (\alpha_n + \beta_n) y_n = 0 \pmod 2.$$

The points which lie on the hyper-planes

$$\gamma_1 y_1 + \gamma_2 y_2 + \dots + \gamma_n y_n = 0, 1 \pmod 2$$

will also lie on the planes

$$(\alpha_1 + \gamma_1) y_1 + (\alpha_2 + \gamma_2) y_2 + \dots + (\alpha_n + \gamma_n) y_n = 0, 1 \pmod 2,$$

$$(\beta_1 + \gamma_1) y_1 + (\beta_2 + \gamma_2) y_2 + \dots + (\beta_n + \gamma_n) y_n = 0, 1 \pmod 2,$$

$$(\alpha_1 + \beta_1 + \gamma_1) y_1 + (\alpha_2 + \beta_2 + \gamma_2) y_2 + \dots + (\alpha_n + \beta_n + \gamma_n) y_n = 0, 1 \pmod 2.$$

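Changing to the simpler notation, these results may be obtained by equating to unity the
symbols corresponding to the effects which the experiment cannot measure (as only treatment
combinations of the same sign in the function giving the effect are included) and
multiplying the symbol corresponding to a particular effect by these symbols. Thus if we put

$$I = y_1^{\alpha_1} y_2^{\alpha_2} \dots y_n^{\alpha_n} = y_1^{\beta_1} y_2^{\beta_2} \dots y_n^{\beta_n} = y_1^{\alpha_1+\beta_1} y_2^{\alpha_2+\beta_2} \dots y_n^{\alpha_n+\beta_n},$$

then the contrast $y_1^{\gamma_1} y_2^{\gamma_2} \dots y_n^{\gamma_n}$ is the same as those given by

$$y_1^{\alpha_1+\gamma_1} y_2^{\alpha_2+\gamma_2} \dots y_n^{\alpha_n+\gamma_n}, \quad y_1^{\beta_1+\gamma_1} y_2^{\beta_2+\gamma_2} \dots y_n^{\beta_n+\gamma_n} \quad\text{and}\quad y_1^{\alpha_1+\beta_1+\gamma_1} y_2^{\alpha_2+\beta_2+\gamma_2} \dots y_n^{\alpha_n+\beta_n+\gamma_n},$$

where each power is reduced modulo 2.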
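The multiplication rule just stated can be tried out numerically. The following is a minimal sketch, not part of the original paper: each symbol is held as a set of factor indices, so that reducing exponents modulo 2 becomes a symmetric difference. The function names `multiply`, `identity_group` and `aliases` are illustrative assumptions.

```python
def multiply(u, v):
    """Product of two effect symbols in the 2^n system.

    A symbol is a set of factor indices; since exponents are reduced
    modulo 2, the product is the symmetric difference of the two sets.
    """
    return frozenset(u) ^ frozenset(v)


def identity_group(generators):
    """All products of the generating words, including the identity I."""
    group = {frozenset()}
    for g in generators:
        group |= {multiply(w, g) for w in group}
    return group


def aliases(effect, generators):
    """Aliases of `effect`: its products with every non-identity word."""
    return {multiply(effect, w) for w in identity_group(generators) if w}


# Half-replicate of a 2^4 system defined by I = y1 y2 y3 y4.
half = [{1, 2, 3, 4}]
print(sorted(map(sorted, aliases({1}, half))))     # alias of y1
print(sorted(map(sorted, aliases({1, 2}, half))))  # alias of y1 y2
```

With two generating words the same functions return the three aliases of each contrast in a quarter-replicate.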


$2^n$ SYSTEM WITHOUT SUBDIVISION INTO BLOCKS

We now consider some of the possibilities of partial replication for the $2^n$ system. The basis
of designs with fractional replication is the choice of an identity relationship; most of the
possible relationships are of no value, and we consider only those which yield the least
possible confusion between main effects and first-order interactions.


Half-replication

$n = 3$. If we take $I = y_1 y_2 y_3$, then $y_1 = y_1 (y_1 y_2 y_3) = y_1^2 y_2 y_3 = y_2 y_3$. Such a design, which
confuses main effects and two-factor interactions, would not be of any practical use.

$n = 4$. If we take $I = y_1 y_2 y_3 y_4$, then the aliases are exemplified by

$$y_1 = y_2 y_3 y_4 \quad\text{and}\quad y_1 y_2 = y_3 y_4.$$

Such a design would not be used unless the experimenter were confident that two-factor
interactions were negligible.

$n = 5$. If we take $I = y_1 y_2 y_3 y_4 y_5$, then the aliases are exemplified by

$$y_1 = y_2 y_3 y_4 y_5 \quad\text{and}\quad y_1 y_2 = y_3 y_4 y_5.$$

A half-replicate with five or more factors is feasible when there is no necessity to remove
heterogeneity by the use of blocks, since main effects will have aliases which are interactions
of four factors at least, and two-factor interactions will have aliases which are interactions of
at least three factors.

Quarter-replication

Each degree of freedom will now have three aliases. For each value of $n$ we give the identity
relationship and typical alias relationships.

$n = 4$. $I = y_1 y_2 = y_3 y_4 = y_1 y_2 y_3 y_4$;
then $y_1 = y_2 = y_1 y_3 y_4 = y_2 y_3 y_4$ and $y_1 y_3 = y_2 y_3 = y_1 y_4 = y_2 y_4$.

$n = 5$. $I = y_1 y_2 = y_3 y_4 y_5 = y_1 y_2 y_3 y_4 y_5$ (a),
or $I = y_1 y_2 y_3 = y_3 y_4 y_5 = y_1 y_2 y_4 y_5$ (b).
(a) gives $y_1 = y_2 = y_1 y_3 y_4 y_5 = y_2 y_3 y_4 y_5$ and $y_1 y_3 = y_2 y_3 = y_1 y_4 y_5 = y_2 y_4 y_5$.
(b) gives $y_1 = y_2 y_3 = y_1 y_3 y_4 y_5 = y_2 y_4 y_5$.

$n = 6$. $I = y_1 y_2 y_3 y_4 = y_3 y_4 y_5 y_6 = y_1 y_2 y_5 y_6$;
then $y_1 = y_2 y_3 y_4 = y_1 y_3 y_4 y_5 y_6 = y_2 y_5 y_6$
and $y_1 y_2 = y_3 y_4 = y_1 y_2 y_3 y_4 y_5 y_6 = y_5 y_6$.

$n = 7$. $I = y_1 y_2 y_3 y_4 = y_4 y_5 y_6 y_7 = y_1 y_2 y_3 y_5 y_6 y_7$;
then $y_1 = y_2 y_3 y_4$ and $y_1 y_2 = y_3 y_4$.

$n = 8$. $I = y_1 y_2 y_3 y_4 y_5 = y_4 y_5 y_6 y_7 y_8 = y_1 y_2 y_3 y_6 y_7 y_8$;
then $y_1 = y_2 y_3 y_4 y_5$ and $y_1 y_2 = y_3 y_4 y_5$.

Designs in quarter replicate are therefore possible when $n$ is greater than or equal to 8.
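One way to see why $n = 8$ is the threshold is to look at the shortest word in each identity relationship: main effects and two-factor interactions stay clear of one another only when every word has at least five letters. The sketch below is an illustration rather than anything given in the paper; the helper names are assumptions, and for $n = 5$ it uses relationship (b).

```python
def word_product(u, v):
    """Multiply two words, reducing exponents modulo 2."""
    return frozenset(u) ^ frozenset(v)


def defining_words(generators):
    """Every non-identity word generated by the given words."""
    group = {frozenset()}
    for g in generators:
        group |= {word_product(w, g) for w in group}
    return [w for w in group if w]


def shortest_word(generators):
    """Number of letters in the shortest word of the identity relationship."""
    return min(len(w) for w in defining_words(generators))


# Generating words of the quarter-replicates listed in the text.
quarter = {
    4: [{1, 2}, {3, 4}],
    5: [{1, 2, 3}, {3, 4, 5}],
    6: [{1, 2, 3, 4}, {3, 4, 5, 6}],
    7: [{1, 2, 3, 4}, {4, 5, 6, 7}],
    8: [{1, 2, 3, 4, 5}, {4, 5, 6, 7, 8}],
}

for n, gens in quarter.items():
    print(n, shortest_word(gens))
```

Only the relationship for $n = 8$ reaches a shortest word of five letters, which agrees with the statement above.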


High-order fractional replication

In general, the existence of fractional designs of the $2^n$ system with fraction $2^{-p}$, which will
be useful where information on all main effects and two-factor interactions is required,
depends on the existence of a group of $2^p$ elements, one element being unity and the other
elements all containing at least five letters. No simple method has been found of enumerating
such groups, but it is perhaps worth recording the following designs which appear to represent
the greatest degree of fractional replication possible.

(a) Eighth replication 

If we are testing ten or more factors at each of two levels, one-eighth of a replication will
enable main effects and two-factor interactions to be estimated. An appropriate identity
relationship is the following:

$$\begin{aligned}
I &= y_1 y_2 y_3 y_4 y_5 = y_1 y_2 y_6 y_7 y_8 = y_3 y_4 y_5 y_6 y_7 y_8 \\
  &= y_1 y_3 y_7 y_9 y_{10} = y_2 y_4 y_5 y_7 y_9 y_{10} = y_2 y_3 y_6 y_8 y_9 y_{10} = y_1 y_4 y_5 y_6 y_8 y_9 y_{10}.
\end{aligned}$$

Thus ten main effects and forty-five two-factor interactions may be estimated from a trial 
testing 128 of the 1024 possible treatment combinations. 

(b) Sixteenth replication 

If we are testing twelve or more factors a possible identity relationship is the following: 

$$\begin{aligned}
I &= y_1 y_2 y_3 y_4 y_5 = y_1 y_2 y_6 y_7 y_8 = y_3 y_4 y_5 y_6 y_7 y_8 = y_1 y_2 y_9 y_{10} y_{11} \\
  &= y_3 y_4 y_5 y_9 y_{10} y_{11} = y_6 y_7 y_8 y_9 y_{10} y_{11} = y_1 y_2 y_3 y_4 y_5 y_6 y_7 y_8 y_9 y_{10} y_{11} \\
  &= y_1 y_3 y_6 y_9 y_{12} = y_2 y_4 y_5 y_6 y_9 y_{12} = y_2 y_3 y_7 y_8 y_9 y_{12} = y_1 y_4 y_5 y_7 y_8 y_9 y_{12} \\
  &= y_2 y_3 y_6 y_{10} y_{11} y_{12} = y_1 y_4 y_5 y_6 y_{10} y_{11} y_{12} = y_1 y_3 y_7 y_8 y_{10} y_{11} y_{12} = y_2 y_4 y_5 y_7 y_8 y_{10} y_{11} y_{12}.
\end{aligned}$$
In this case twelve main effects and sixty-six two-factor interactions may be estimated from 
a trial testing 256 of the possible 4096 treatment combinations. 
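The two identity relationships above can be checked by machine. The following sketch is an assumption-laden illustration and not the author's procedure: it generates each group from three or four generating words read off the relationships, and confirms that no two contrasts among the main effects and two-factor interactions fall in the same alias set. The function names are hypothetical.

```python
from itertools import combinations


def product(u, v):
    """Multiply two symbols, reducing exponents modulo 2."""
    return frozenset(u) ^ frozenset(v)


def identity_group(generators):
    """The identity I together with all products of the generating words."""
    group = {frozenset()}
    for g in generators:
        group |= {product(w, g) for w in group}
    return group


def low_order_effects_clear(n_factors, generators):
    """True if main effects and two-factor interactions have distinct alias sets."""
    group = identity_group(generators)
    seen = {frozenset(group)}  # the alias set of I itself
    for k in (1, 2):
        for c in combinations(range(1, n_factors + 1), k):
            coset = frozenset(product(frozenset(c), w) for w in group)
            if coset in seen:
                return False
            seen.add(coset)
    return True


eighth = [{1, 2, 3, 4, 5}, {1, 2, 6, 7, 8}, {1, 3, 7, 9, 10}]
sixteenth = [{1, 2, 3, 4, 5}, {1, 2, 6, 7, 8}, {1, 2, 9, 10, 11}, {1, 3, 6, 9, 12}]

for n, gens in ((10, eighth), (12, sixteenth)):
    runs = 2 ** (n - len(gens))          # treatment combinations tested
    n_effects = n + n * (n - 1) // 2     # main effects plus two-factor interactions
    print(n, runs, n_effects, low_order_effects_clear(n, gens))
```

The run counts printed are the 128 and 256 combinations quoted in the text, with 55 and 78 low-order contrasts respectively.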

The extent to which these designs will be of practical value depends very much on the 
existence of a sufficient mass of reasonably homogeneous material to test the large number 
of treatment combinations without the necessity of dividing the material into smaller 
batches and using the device of confounding. An experiment involving say 256 different 
treatment combinations is not large by modern standards. At Rothamsted, for example, an 
experiment involving 200 distinct treatments on 300 plots has been carried out for some 
years: this experiment was, however, made possible by utilizing the elimination of the 
effects of soil heterogeneity by highly complex confounding; the design, in fact, consisted of 
three 5x5 lattice squares necessitating seventy-five plots, and each of these plots was split 
into four subplots. The advantages of testing twelve factors, say, at the same time under 
virtually the same experimental conditions cannot, however, be ignored. Such an experi- 
ment should have more value, other things being equal, than two distinct experiments each 
testing some of the factors. An examination has not been made of the possibilities of reducing 
block size by confounding for the above two designs, but it is probably necessary to sacrifice 
a few two-factor interactions. 

The relationship between fractional replication and confounding 

It is clear that fractional replication and confounding are different aspects of the same
process. A $2^n$ design of $2^p$ blocks may be described as a 1 in $2^p$ replicate of a $2^{n+p}$ design
with no subdivision into blocks, by regarding the blocks as a $2^p$ system in $p$ factors. As an
example, consider the $2^5$ design in $y_1, y_2, y_3, y_4$ and $y_5$ laid out in four blocks of eight and
confounding $y_1 y_2 y_3$, $y_3 y_4 y_5$ and $y_1 y_2 y_4 y_5$; superimposing two pseudo-factors $b_1$ and $b_2$, the
experiment is a quarter-replicate of a $2^7$ design in $y_1, y_2, y_3, y_4, y_5, b_1, b_2$. The identity on which
the quarter replicate is based is given by the equations

$$b_1 = y_1 y_2 y_3, \quad b_2 = y_3 y_4 y_5, \quad b_1 b_2 = y_1 y_2 y_4 y_5,$$

or the equation

$$I = y_1 y_2 y_3 b_1 = y_3 y_4 y_5 b_2 = y_1 y_2 y_4 y_5 b_1 b_2.$$

If we examine this equation in the same way as in the previous sections, we find that the
design depends on the fact that the aliases of the following type may be ignored:

$$y_1 = y_2 y_3 b_1 = y_1 y_3 y_4 y_5 b_2 = y_2 y_4 y_5 b_1 b_2,$$

$$y_1 y_2 = y_3 b_1 = y_1 y_2 y_3 y_4 y_5 b_2 = y_4 y_5 b_1 b_2.$$

This example is worth pursuing. The design is frequently used with one replication only,
the error being estimated from three-factor and higher-order interactions. We set out below
the identity and 31 degrees of freedom together with all their aliases and their usual place
in the analysis of variance: blocks (B), treatment (T), or error (E). For convenience of
printing we denote the factors tested in the experiment by a, b, c, d, e instead of $y_1, y_2, y_3, y_4, y_5$
and the block factors by x and y. Capitals are used for treatment effects, thus conforming
to present usage.


I      = ABCX    = CDEY    = ABDEXY

A      = BCX     = ACDEY   = BDEXY     T
B      = ACX     = BCDEY   = ADEXY     T
AB     = CX      = ABCDEY  = DEXY      T
C      = ABX     = DEY     = ABCDEXY   T
AC     = BX      = ADEY    = BCDEXY    T
BC     = AX      = BDEY    = ACDEXY    T
ABC    = X       = ABDEY   = CDEXY     B
D      = ABCDX   = CEY     = ABEXY     T
AD     = BCDX    = ACEY    = BEXY      T
BD     = ACDX    = BCEY    = AEXY      T
ABD    = CDX     = ABCEY   = EXY       E
CD     = ABDX    = EY      = ABCEXY    T
ACD    = BDX     = AEY     = BCEXY     E
BCD    = ADX     = BEY     = ACEXY     E
ABCD   = DX      = ABEY    = CEXY      E
E      = ABCEX   = CDY     = ABDXY     T
AE     = BCEX    = ACDY    = BDXY      T
BE     = ACEX    = BCDY    = ADXY      T
ABE    = CEX     = ABCDY   = DXY       E
CE     = ABEX    = DY      = ABCDXY    T
ACE    = BEX     = ADY     = BCDXY     E
BCE    = AEX     = BDY     = ACDXY     E
ABCE   = EX      = ABDY    = CDXY      E
DE     = ABCDEX  = CY      = ABXY      T
ADE    = BCDEX   = ACY     = BXY       E
BDE    = ACDEX   = BCY     = AXY       E
ABDE   = CDEX    = ABCY    = XY        B
CDE    = ABDEX   = Y       = ABCXY     B
ACDE   = BDEX    = AY      = BCXY      E
BCDE   = ADEX    = BY      = ACXY      E
ABCDE  = DEX     = ABY     = CXY       E

If we take for each linear function of the yields the alias involving the smallest possible 
number of letters, but remembering that x , y are pseudo-factors, so that X , Y and X Y are of 




equal importance and therefore XY should be regarded as a main effect and not an inter- 
action, we have the following allocation of contrasts to the three components of the analysis 
of variance: 

Blocks: X, Y, XY.

Treatments: A, B, C, D, E.
            AB = CX, AC = BX, BC = AX,
            CD = EY, DE = CY, CE = DY.
            AD, BD, AE, BE.

Error: AY, BY, DX, EX, AXY, BXY, CXY, DXY, EXY, ACD, BCD, ACE, BCE.

The four three-factor interactions could equally well be regarded as interactions between 
two-factor interactions and blocks. It would be anticipated that these would be smaller 
than the interactions of main effects and blocks. The purpose of the present exposition is 
to give a clear statement of the possible interpretations of the results of an individual 
experiment. Further remarks on the problem of interpretation are postponed to a later 
section in the paper. 
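The allocation above follows mechanically from the identity $I = ABCX = CDEY = ABDEXY$. The sketch below is a hypothetical reconstruction of that bookkeeping, not code from the paper: it multiplies each of the 31 contrasts by the three words, counts $X$, $Y$ and $XY$ alike as a single block factor, and reports for each error contrast the alias with the fewest factors. The names `times`, `alias_set` and `weight` are assumptions.

```python
from itertools import combinations

FACTORS = "ABCDE"                      # treatment factors
BLOCK_LETTERS = set("XY")              # pseudo-factors for the four blocks
WORDS = ["ABCX", "CDEY", "ABDEXY"]     # I = ABCX = CDEY = ABDEXY


def times(u, v):
    """Product of two symbols with exponents reduced modulo 2."""
    return "".join(sorted(set(u) ^ set(v)))


def alias_set(effect):
    """The effect itself followed by its three aliases."""
    return [effect] + [times(effect, w) for w in WORDS]


def weight(symbol):
    """Count factors, treating X, Y and XY alike as one block factor."""
    treatment = sum(1 for c in symbol if c not in BLOCK_LETTERS)
    return treatment + (1 if set(symbol) & BLOCK_LETTERS else 0)


effects = ["".join(c) for k in range(1, 6) for c in combinations(FACTORS, k)]

# Contrasts aliased with X, Y or XY are block differences.
blocks = [e for e in effects
          if any(set(m) <= BLOCK_LETTERS for m in alias_set(e))]
# Main effects and two-factor interactions are the treatment contrasts.
treatments = [e for e in effects if len(e) <= 2]
# Everything else is used for error; report its shortest alias.
error = [min(alias_set(e), key=weight)
         for e in effects if e not in blocks and e not in treatments]

print("Blocks:    ", blocks)
print("Treatments:", treatments)
print("Error:     ", error)
```

Ties in `weight` are resolved in favour of the treatment interaction itself, which is why ACD, BCD, ACE and BCE appear under their own names in the error list.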


An example of fractional replication with confounding 

A design which has proved of practical utility is the half-replicate of a $2^6$ experiment arranged
in four blocks of eight plots.

Call the factors $y_1, y_2, y_3, y_4, y_5, y_6$. Then the best confounding is that in which, using full
replication, the block differences are all third-order interactions, say

$$y_1 y_2 y_3 y_4, \quad y_3 y_4 y_5 y_6 \quad\text{and}\quad y_1 y_2 y_5 y_6.$$

But it is impossible to keep main effects and interactions clear with this confounding,
whatever interaction is equated to the identity.

If we take the confounded interactions to be of the type

$$y_1 y_2 y_3, \quad y_3 y_4 y_5$$

and the interaction $y_1 y_2 y_3 y_4 y_5 y_6$ to be unity, then the following interactions are also
confounded:

$$y_4 y_5 y_6, \quad y_1 y_2 y_6 \quad\text{and}\quad y_3 y_6.$$


It will be found by enumeration of the possibilities that one first-order interaction must 
be sacrificed. All main effects and the other first-order interactions will have high-order 
aliases. 

It is interesting to examine this design in the same way as the $2^5$ above for the relations
between block-treatment interactions and treatment interactions.

There are, in fact, only thirty-two independent contrasts, and it is simplest to enumerate
these by operating on the identity relationship with the thirty-two possibilities for the $2^5$
system omitting $y_6$. As before, we insert block pseudo-factors. For simplicity of printing we
use A, B, C, D, E, F for the factors and X, Y for the block factors. Then


I = ABCDEF, X = ABC, Y = CDE, XY = ABDE,
and combining these into one relationship, we have

I = ABCDEF = ABCX = CDEY = ABDEXY = DEFX = ABFY = CFXY.





A complete table of the aliases for this design follows:

I      = ABCDEF = ABCX     = DEFX     = CDEY     = ABFY     = ABDEXY   = CFXY

A      = BCDEF  = BCX      = ADEFX    = ACDEY    = BFY      = BDEXY    = ACFXY      T
B      = ACDEF  = ACX      = BDEFX    = BCDEY    = AFY      = ADEXY    = BCFXY      T
AB     = CDEF   = CX       = ABDEFX   = ABCDEY   = FY       = DEXY     = ABCFXY     T
C      = ABDEF  = ABX      = CDEFX    = DEY      = ABCFY    = ABCDEXY  = FXY        T
AC     = BDEF   = BX       = ACDEFX   = ADEY     = BCFY     = BCDEXY   = AFXY       T
BC     = ADEF   = AX       = BCDEFX   = BDEY     = ACFY     = ACDEXY   = BFXY       T
ABC    = DEF    = X        = ABCDEFX  = ABDEY    = CFY      = CDEXY    = ABFXY      B
D      = ABCEF  = ABCDX    = EFX      = CEY      = ABDFY    = ABEXY    = CDFXY      T
AD     = BCEF   = BCDX     = AEFX     = ACEY     = BDFY     = BEXY     = ACDFXY     T
BD     = ACEF   = ACDX     = BEFX     = BCEY     = ADFY     = AEXY     = BCDFXY     T
ABD    = CEF    = CDX      = ABEFX    = ABCEY    = DFY      = EXY      = ABCDFXY    E
CD     = ABEF   = ABDX     = CEFX     = EY       = ABCDFY   = ABCEXY   = DFXY       T
ACD    = BEF    = BDX      = ACEFX    = AEY      = BCDFY    = BCEXY    = ADFXY      E
BCD    = AEF    = ADX      = BCEFX    = BEY      = ACDFY    = ACEXY    = BDFXY      E
ABCD   = EF     = DX       = ABCEFX   = ABEY     = CDFY     = CEXY     = ABDFXY     T
E      = ABCDF  = ABCEX    = DFX      = CDY      = ABEFY    = ABDXY    = CEFXY      T
AE     = BCDF   = BCEX     = ADFX     = ACDY     = BEFY     = BDXY     = ACEFXY     T
BE     = ACDF   = ACEX     = BDFX     = BCDY     = AEFY     = ADXY     = BCEFXY     T
ABE    = CDF    = CEX      = ABDFX    = ABCDY    = EFY      = DXY      = ABCEFXY    E
CE     = ABDF   = ABEX     = CDFX     = DY       = ABCEFY   = ABCDXY   = EFXY       T
ACE    = BDF    = BEX      = ACDFX    = ADY      = BCEFY    = BCDXY    = AEFXY      E
BCE    = ADF    = AEX      = BCDFX    = BDY      = ACEFY    = ACDXY    = BEFXY      E
ABCE   = DF     = EX       = ABCDFX   = ABDY     = CEFY     = CDXY     = ABEFXY     T
DE     = ABCF   = ABCDEX   = FX       = CY       = ABDEFY   = ABXY     = CDEFXY     T
ADE    = BCF    = BCDEX    = AFX      = ACY      = BDEFY    = BXY      = ACDEFXY    E
BDE    = ACF    = ACDEX    = BFX      = BCY      = ADEFY    = AXY      = BCDEFXY    E
ABDE   = CF     = CDEX     = ABFX     = ABCY     = DEFY     = XY       = ABCDEFXY   B
CDE    = ABF    = ABDEX    = CFX      = Y        = ABCDEFY  = ABCXY    = DEFXY      B
ACDE   = BF     = BDEX     = ACFX     = AY       = BCDEFY   = BCXY     = ADEFXY     T
BCDE   = AF     = ADEX     = BCFX     = BY       = ACDEFY   = ACXY     = BDEFXY     T
ABCDE  = F      = DEX      = ABCFX    = ABY      = CDEFY    = CXY      = ABDEFXY    T


The partition of the degrees of freedom in the analysis of variance which would generally
be made is the following:

                                  D.F.
    Blocks                          3
    Treatments: Main effects        6
                Interactions       14
    Error                           8
                                   --
                                   31

The table of aliases is condensed below by the omission of all aliases involving more than
two factors, counting, as before, XY as a single factor as well as X and Y.

Effects A, B, D, E have aliases of at least three letters, but C = FXY and F = CXY.
Effects AD, BD, AE, BE have aliases of at least three letters, but

AB = CX = FY, AC = BX, BC = AX, CD = EY, EF = DX,
CE = DY, DE = FX = CY, DF = EX, BF = AY, AF = BY.

In an experiment in which block-treatment interactions cannot be assumed to be negligible
in relation to the effects it is desired to estimate, the interpretation of most two-factor
interactions is difficult if not impossible. The following identities of practical interest exist
for the terms which would be used to estimate the error: ACD, BCD, ACE, BCE have
aliases of three letters and are either three-factor interactions or interactions between blocks
and two-factor interactions, but ABD = EXY, ABE = DXY, ADE = BXY, and
BDE = AXY.




This design is very similar in result to the fully replicated but confounded $2^5$ design
described above.
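The partition into 3, 6, 14 and 8 degrees of freedom for the half-replicate can be recovered from the identity $I = ABCDEF = ABCX = DEFX = CDEY = ABFY = ABDEXY = CFXY$ alone. The sketch below is illustrative only: the classification rule (blocks first, then the lowest-order treatment alias) is an assumption about how the table was drawn up, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

WORDS = ["ABCDEF", "ABCX", "DEFX", "CDEY", "ABFY", "ABDEXY", "CFXY"]
BLOCK_LETTERS = set("XY")


def times(u, v):
    """Product of two symbols with exponents reduced modulo 2."""
    return "".join(sorted(set(u) ^ set(v)))


def classify(effect):
    """Place one of the 31 contrasts in the analysis of variance."""
    members = [effect] + [times(effect, w) for w in WORDS]
    if any(set(m) <= BLOCK_LETTERS for m in members):
        return "Blocks"
    # Lowest-order alias that involves treatment factors only.
    order = min(len(m) for m in members if not set(m) & BLOCK_LETTERS)
    return {1: "Main effects", 2: "Interactions"}.get(order, "Error")


# Enumerate the contrasts from the 2^5 system in A..E, omitting F.
contrasts = ["".join(c) for k in range(1, 6) for c in combinations("ABCDE", k)]
partition = Counter(classify(e) for e in contrasts)
print(dict(partition))
```

The contrast ABDE, although aliased with the two-factor interaction CF, is counted under blocks because it is also aliased with XY; this is the first-order interaction that has to be sacrificed.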

Fractional replication in the $3^n$ system

Here we have to consider treatment effects assessed from powers of one-third of a complete
replicate. Only those treatment combinations represented by points of the lattice lying on
the hyperplane

$$\alpha_1 y_1 + \alpha_2 y_2 + \dots + \alpha_n y_n = 0, \text{ or } 1, \text{ or } 2 \pmod 3$$

will be included in a one-third replicate.

A particular treatment effect is given by the differences between the means of those plots
represented by points on the following three planes:

$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 0 \pmod 3,$$

$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 1 \pmod 3,$$

$$\beta_1 y_1 + \beta_2 y_2 + \dots + \beta_n y_n = 2 \pmod 3.$$

It is obvious that the points lying on the first plane will also lie on the planes

$$(\beta_1 + \lambda\alpha_1) y_1 + (\beta_2 + \lambda\alpha_2) y_2 + \dots + (\beta_n + \lambda\alpha_n) y_n = 0 \pmod 3, \quad\text{for } \lambda = 1 \text{ and } 2;$$

the points on the other two planes will lie on these planes with 1 and 2 respectively on the
right-hand side of the equation.

The aliases of each pair of degrees of freedom are therefore obtained by multiplication of
its symbol by $y_1^{\alpha_1} y_2^{\alpha_2} \dots y_n^{\alpha_n}$ and by its square.

As an example, suppose a third replicate of a $3^3$ design is based on the inclusion only of
those treatment combinations represented by the symbol $y_1 y_2 y_3$ ($y_1 + y_2 + y_3 = 0$ say), then
the aliases are exemplified by the relationship $y_1 = y_1 y_2^2 y_3^2 = y_2 y_3$.
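The rule of multiplying by the identity symbol and by its square can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's own working: symbols are stored as exponent vectors modulo 3, and each result is rescaled so that its first non-zero exponent is 1, since a symbol and its square describe the same pair of degrees of freedom. The names `normalise`, `aliases` and `show` are hypothetical.

```python
def normalise(exps):
    """Rescale an exponent vector (mod 3) so its first non-zero entry is 1."""
    for e in exps:
        if e % 3:
            scale = 1 if e % 3 == 1 else 2   # 2 * 2 = 4 = 1 (mod 3)
            return tuple((scale * x) % 3 for x in exps)
    return tuple(0 for _ in exps)


def aliases(effect, word):
    """Multiply `effect` by the identity word and by its square."""
    return [
        normalise([(b + lam * a) % 3 for a, b in zip(word, effect)])
        for lam in (1, 2)
    ]


def show(exps):
    """Write an exponent vector in the y-notation of the text."""
    parts = [f"y{i}" + ("^2" if e == 2 else "")
             for i, e in enumerate(exps, 1) if e]
    return " ".join(parts) or "I"


word = (1, 1, 1)                      # I = y1 y2 y3
effect = (1, 0, 0)                    # the main effect y1
print(show(effect), "=", " = ".join(show(a) for a in aliases(effect, word)))
```

Adding a fourth coordinate for a block pseudo-factor lets the same functions handle a confounded $3^3$ design treated as a one-third replicate of a $3^4$.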

The confounding of one replicate of a $3^3$ experiment in three blocks of nine plots

A frequently used design is the $3^3$ in three blocks of nine plots, testing all combinations of
three factors each at three levels. This design is formally a one-third replicate of a $3^4$ design.
Suppose the factors are $y_1$, $y_2$, and $y_3$ and let blocks be denoted by the pseudo-factor $b$;
a three-factor interaction of $y_1$, $y_2$, and $y_3$, say $y_1 y_2 y_3$, is usually confounded in order to keep
main effects and first-order interactions free of block effects.

Then $b = y_1 y_2 y_3$ or $I = y_1 y_2 y_3 b^2$, since $b^3 = 1$.

As in the case of the $2^5$ design, we work out the aliases of each pair of degrees of freedom:
each pair of degrees of freedom will in this case have two aliases:

$$\begin{aligned}
y_1 &= y_1 y_2^2 y_3^2 b = y_2 y_3 b^2 & y_2 y_3 &= y_1 y_2^2 y_3^2 b^2 = y_1 b^2 \\
y_2 &= y_1 y_2^2 y_3 b^2 = y_1 y_3 b^2 & y_2 y_3^2 &= y_1 y_2^2 b^2 = y_1 y_3^2 b^2 \\
y_3 &= y_1 y_2 y_3^2 b^2 = y_1 y_2 b^2 & y_1 y_2 y_3 &= y_1 y_2 y_3 b = b^2 \\
y_1 y_2 &= y_1 y_2 y_3^2 b = y_3 b^2 & y_1 y_2 y_3^2 &= y_1 y_2 b = y_3 b \\
y_1 y_2^2 &= y_1 y_3^2 b = y_2 y_3^2 b & y_1 y_2^2 y_3 &= y_1 y_3 b = y_2 b \\
y_1 y_3 &= y_1 y_2^2 y_3 b = y_2 b^2 & y_1 y_2^2 y_3^2 &= y_1 b = y_2 y_3 b \\
y_1 y_3^2 &= y_1 y_2^2 b = y_2 y_3^2 b^2 &&
\end{aligned}$$





Here again the identities could result in difficulty in interpretation — as of course could 
have been predicted from the examination of the possible arrangements in blocks of nine 
of the 3 4 design. The main effects may be regarded as clear, and three of the first-order 
interactions. The remaining two-factor interactions could be ascribed to differential effects 
of the factors on the three blocks. The three-factor interactions which are not confounded 
with blocks are also ascribable to interactions of main effects and blocks and may therefore 
be used to form an estimate of the error of these effects. 


General remarks on confounding 

The device of confounding is used almost without exception in agricultural experiments in 
order to reduce the block size to twelve or less plots. As the above results indicate there are 
two aspects which then need careful consideration, (a) the estimation of interactions, and 
(b) the estimation of the experimental error. 

The main purpose of the factorial design is the estimation of main effects and interactions 
between pairs of factors and thence of the effect of any one factor in the presence and absence
of each of the other factors. It is clear that when it is necessary to remove soil heterogeneity 
by confounding, the interpretation of a small experiment involving a few factors may be 
exceedingly difficult because of the possibility of block-treatment interactions. It is possible 
to use the rule that a large contrast should be regarded as the interaction between 
whichever pair of main effects is the larger, but this rule will break down in some cases when, 
for example, the contrast has two aliases AB and CD , and effects A and C are large and 
B and D small. In the case of a series of experiments, a device which might be helpful is the 
use of permutations of the possible identity relationships, one at each centre. The modern
emphasis in agricultural experimentation is on series of experiments at various places and 
in several years, rather than on individual experiments. Interactions of pairs of factors 
will be estimated correctly from a large series of experiments if treatments are assigned 
at random to blocks. 

The evaluation of two-factor interactions for individual experiments depends on the 
assumption that block-treatment interactions are small compared with the experimental 
error. Yates (1935) examined several experiments for the existence of such interactions and 
found no evidence of them. Since that time a large number of experimental results which 
can be used to provide information on the question have been accumulated, and an investiga- 
tion of these has indicated that block-treatment interactions are negligible and may be 
ignored (Kempthorne, 1947). 

With regard to the estimation of error, in so far as tests of significance are of interest, it can 
be said that the analysis of variance does provide a test of significance of the hypothesis that 
the treatments have an overall effect different from zero. In agricultural experimentation, the 
term error is used to denote block-treatment interactions. Thus in the simple randomized 
block experiment, it is possible to evaluate the difference between two treatments from each 
block, and it is the variability of this difference from block to block which is regarded as the 
error. In general, as there are usually few blocks, and the error of each comparison would be 
determined with poor accuracy, the errors of all the possible independent comparisons 
are pooled to give a common estimate. If the treatments were duplicated at random 





within each block, the analysis would be of the form (r being the number of blocks and
t of treatments):

                                  D.F.
    Blocks                        r - 1
    Treatments                    t - 1
    Treatments by blocks          (r - 1)(t - 1)
    Within blocks                 rt
                                  -------
    Total                         2rt - 1
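As a check on this partition, the degrees of freedom can be added up. With $r$ blocks and $t$ treatments, each duplicated within every block, there are $2rt$ plots, so the total should be $2rt - 1$:

```latex
\begin{align*}
(r-1) + (t-1) + (r-1)(t-1) + rt
  &= (r-1) + (t-1) + (rt - r - t + 1) + rt \\
  &= (rt - 1) + rt \\
  &= 2rt - 1 .
\end{align*}
```

The first three components together account for the $rt - 1$ degrees of freedom among the $rt$ block-treatment cells, and the remaining $rt$ come from the duplicate plots within each cell.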


The component ‘within blocks’ could more accurately be described as experimental
error, but would not be used to evaluate the errors of treatment effects, since the experi- 
menter is interested in the constancy of treatment effects from block to block. There is 
therefore little point in actually carrying out such an experiment. In a factorial experiment 
with replication, the components which could be evaluated consist of replicates, effects and 
low-order interactions, high-order interactions, and interactions of treatments and repli- 
cates. On the assumption that the sum of squares for interaction of treatments and 
replicates is homogeneous, the mean square for high-order interactions will include the mean 
square for treatments × replicates plus a component of variance due to high-order
interactions. When only one replication is used, it is assumed that the component of variance due
to high-order interactions is small, and that the high-order interactions mean square can be 
regarded as an estimate of error. It is important to bear in mind that an individual agri- 
cultural experiment can give information only for a particular set of experimental conditions 
and that it is known from experience that place to place and year to year variability is 
considerable. It would therefore be uneconomical to utilize available resources to determine 
effects and their errors at a few particular places very accurately, but preferable to sacrifice 
replication at each place in order to have information over a large range of experimental 
conditions. 


Mixed systems 

It is not proposed to examine mixed systems of the type $p^m q^n$, where $p$ and $q$ are primes, in
the present paper. It is clear, however, that the possibilities of complete confounding and
fractional replication are very limited. A $p$'th replicate must obviously include $p^{m-1}$
combinations of the $m$ factors combined with all the $q^n$ combinations of the $n$ factors. For
the examination of treatment aliases the system may be regarded as the product of the two
separate systems. Thus if $p = 3$, $m = 2$, $q = 2$, $n = 3$ and the factors are $y_1, y_2, y_3, y_4, y_5$, then
a half replicate would be obtained by putting $I = y_3 y_4 y_5$. The aliases which result are
exemplified by the following:

$$y_1 = y_1 y_3 y_4 y_5, \qquad y_1 y_2^2 = y_1 y_2^2 y_3 y_4 y_5,$$

$$y_1 y_2 = y_1 y_2 y_3 y_4 y_5, \qquad y_3 = y_4 y_5.$$

Such designs with fractional replication or complete confounding are therefore useful only
when the corresponding designs for the two separate systems are feasible.

Comments on ‘The design of optimum multifactorial experiments’

In a paper entitled ‘The Design of Optimum Multifactorial Experiments’, Plackett & 
Burman (1946) put forward designs more specifically for physical and industrial research, 
which are of interest from the point of view of fractional replication. In order to estimate the 





effect of varying nine components of an assembly, each component having two possible
values, a nominal (-) and an extreme (+), they put forward the following design which
requires the testing of sixteen assemblies:



                          Components
               1   2   3   4   5   6   7   8   9

Assembly  1    +   -   -   -   +   -   -   +   +
          2    +   +   -   -   -   +   -   -   +
          3    +   +   +   -   -   -   +   -   -
          4    +   +   +   +   -   -   -   +   -
          5    -   +   +   +   +   -   -   -   +
          6    +   -   +   +   +   +   -   -   -
          7    -   +   -   +   +   +   +   -   -
          8    +   -   +   -   +   +   +   +   -
          9    +   +   -   +   -   +   +   +   +
         10    -   +   +   -   +   -   +   +   +
         11    -   -   +   +   -   +   -   +   +
         12    +   -   -   +   +   -   +   -   +
         13    -   +   -   -   +   +   -   +   -
         14    -   -   +   -   -   +   +   -   +
         15    -   -   -   +   -   -   +   +   -
         16    -   -   -   -   -   -   -   -   -


Yates put forward a similar design in his 1935 paper for the weighing of a number of small 
articles on a balance which required a zero correction, as an example of the estimation of the 
effects of independent factors. In his case there was a close formal analogy to the 2 n factorial 
system, and it will now be shown that Plackett & Burman’s design given above is a high- 
order fractional design of the type discussed in the present paper. 

Denoting the nominal values by unity and the extreme values of the nine components by
a, b, c, d, e, f, g, h, k in order, the treatment combinations represented are 1, aehk, abfk, abcg,
abcdh, bcdek, acdef, bdefg, acefgh, abdfghk, bceghk, cdfhk, adegk, befh, cfgk, dgh. It is found merely
by one-by-one examination of the three-factor interactions that all the above sets of treatment
combinations occur with the same sign in the following:

ABE, ACK, BCF, CDG, DEH.

The same will be true for all the members of the Abelian group of which the above five
interactions are generators. The identity relationship is therefore:

I = ABE = ACK = BCEK = BCF = ACEF = ABFK = EFK

= CDG = ABCDEG = ADGK = BDEGK = BDFG = ADEFG = ABCDFGK = CDEFGK

= DEH = ABDH = ACDEHK = BCDHK = BCDEFH = ACDFH = ABDEFHK = DFHK

= CEGH = ABCGH = AEGHK = BGHK = BEFGH = AFGH = ABCEFGHK = CFGHK.

The identities of interest to the experimenter are the following:

I = ABE = ACK = BCF = EFK = CDG = DEH;
from these we derive the following aliases for main effects:

A = BE = CK,           F = BC = EK,
B = AE = CF,           G = CD,
C = AK = BF = DG,      H = DE,
D = CG = EH,           K = AC = EF.
E = AB = FK = DH,

In all cases, the contrasts estimating main effects are minus the contrasts estimating
interactions. If, for example, the interaction of B and E is negative, and A has no effect, the
conclusion drawn by the experimenter will be that A has a positive effect. It is possible but
rather difficult to imagine physical systems in which effects will not interact, and
interpretation of the results of experiments based on this design may often be impossible. With nine
factors, it appears from the present work that the minimum number of combinations which
should be tested is 128, that is one-quarter of a replication, though it is possible that by making
less stringent assumptions about two-factor interactions, one-eighth of a replication might 
give intelligible results. A possible instance in which it might be feasible to use the designs 
discussed is when it is expected that only one or two of the factors have an effect, and the 
problem is to determine as quickly as possible which of the nine factors are responsible. 
An example in which a high-order fractional design was used in such circumstances with good 
results has been described by Tippett (1936). A detailed examination of all the designs put 
forward by Plackett & Burman will not be undertaken, but the lines on which such an 
examination would proceed and the broad conclusions which would emerge are obvious 
from the above examination of one of their simpler designs. 
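The one-by-one examination of three-factor interactions described above is easy to mechanise. The following sketch is an illustration, not the procedure of either paper: it takes the sixteen treatment combinations, codes a letter as +1 when present and -1 when absent, and keeps every three-factor interaction whose sign is the same in all sixteen assemblies. The names `sign`, `constant` and `aliases` are assumptions.

```python
from itertools import combinations

LETTERS = "abcdefghk"
# The sixteen assemblies; "" is the combination with every component nominal.
RUNS = ["", "aehk", "abfk", "abcg", "abcdh", "bcdek", "acdef", "bdefg",
        "acefgh", "abdfghk", "bceghk", "cdfhk", "adegk", "befh", "cfgk", "dgh"]


def sign(run, word):
    """Sign of `run` in the contrast `word` (+1 for each letter present)."""
    s = 1
    for letter in word:
        s *= 1 if letter in run else -1
    return s


# Three-factor interactions taking a single sign over all sixteen runs.
constant = [
    "".join(w).upper()
    for w in combinations(LETTERS, 3)
    if len({sign(run, w) for run in RUNS}) == 1
]
signs = {w: sign(RUNS[1], w.lower()) for w in constant}

# Each such word aliases a main effect with a two-factor interaction.
aliases = {
    letter: ["".join(sorted(set(w) - {letter})) for w in constant if letter in w]
    for letter in LETTERS.upper()
}

print(constant)
print(signs)
for letter, partners in aliases.items():
    print(letter, "=", " = ".join(partners))
```

Every sign in `signs` comes out as -1 under this coding, which corresponds to the remark that the contrasts estimating main effects are minus those estimating the aliased interactions.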


Conclusions 

A method of examining fractional replication and confounding for some types of factorial 
experiments is described. The formal equivalence between the two is indicated and the 
implications of this equivalence discussed. Further progress will follow on group theory 
lines and this is being examined, together with the possibility of fractional replication when 
the fraction is greater than unity. The possibilities are explored of the estimation of main 
effects and two-factor interactions of many factors by testing only a small proportion of 
the possible treatment combinations. An examination on these lines is made of designs 
proposed by Plackett & Burman. 


REFERENCES 

Finney, D. J. (1945). The fractional replication of factorial arrangements. Ann. Eugen., Lond., 12,
291-301.

Fisher, R. A. (1942). The theory of confounding in factorial experiments in relation to the theory of
groups. Ann. Eugen., Lond., 11, 341-53.

Kempthorne, O. (1947). A note on differential responses in blocks. J. Agric. Sci. 37, 245-48.

Plackett, R. L. & Burman, J. P. (1946). The design of optimum multifactorial experiments.
Biometrika, 33, 305-25.

Tippett, L. H. C. (1936). Applications of Statistical Methods to the Control of Quality in Industrial
Production. Manchester Statist. Soc.

Yates, F. (1935). Complex experiments. J. R. Statist. Soc. Suppl. 2, 181-223.

Yates, F. (1937). The design and analysis of factorial experiments. Tech. Commun. Imp. Bur. Soil Sci.,
no. 35.





A COMPARISON OF STRATIFIED WITH UNRESTRICTED RANDOM 
SAMPLING FROM A FINITE POPULATION* 

By P. ARMITAGE, B.A. 

1. Introduction 

1.1. We are concerned in this paper with the problem of estimating the mean value $\mu$
of a variable x in a population, by taking a sample which is in some way representative of the 
population. It has been realized since Bowley’s paper (1926), and more particularly since 
Neyman’s more comprehensive survey (1934), that a certain degree of precision in the 
estimate can often be obtained more economically by stratified random sampling (usually 
referred to merely as stratified sampling) than by unrestricted random sampling (usually
called merely random sampling). In the stratified method, the population is divided into 
several strata, the sample size divided in some prearranged way among the strata, and 
sampling performed at random from each stratum. In unrestricted random sampling, a 
random selection is made from the whole population, and the method may be regarded as 
a particular case of stratification, where the number of strata is one. 

Some text-books deal briefly with stratified sampling. Wilks (1943) considers only 
infinite populations, and denotes by representative sampling what we should call a particular
type of stratified sampling (see §1.2). The subject is treated by Kendall (1946,
pp. 249-52), but he makes no comparison with unrestricted random sampling. We shall 
begin by introducing several well-known results which will be needed later. 

1.2. The summation sign $\sum_i$ will be used throughout for $\sum_{i=1}^{r}$, and $\sum_k$ for $\sum_{k=1}^{r}$. In general,
$\sum_i$ is used for a single summation, $\sum_i \sum_k$ for a double summation, and the suffix $k$ where no
summation is involved.

We shall consider the following position: A population $\pi$ of size $N$ is subdivided into $r$
strata, $\pi_k$, of size $N_k$ ($\sum_i N_i = N$). The variable $x$ is distributed so that the mean and variance
(divisor $N_k$) within $\pi_k$ are respectively $\mu_k$, $\sigma_k^2$. It is required to estimate $\mu = \sum_i N_i \mu_i / N$, the
grand mean.

Suppose a given sample size, $n$, is divided so that $n_k$ items are sampled at random from
$\pi_k$ ($\sum_i n_i = n$). We may denote the $j$th observation from the $k$th sample by $x_{kj}$ ($j = 1, 2, \ldots, n_k$),
and the mean and variance of the $k$th sample by $\bar{x}_k$ and $s_k^2$, which are known to be unbiased
estimates of $\mu_k$ and $N_k \sigma_k^2 / (N_k - 1)$ respectively (see, for example, Kendall, 1943, p. 284).

It seems intuitively obvious to take as our estimate of $\mu$,

$$m = \sum_i N_i \bar{x}_i / N, \qquad \text{(i)}$$

which is clearly unbiased. This is, however, not the only unbiased estimate which is a linear
function of the $x_{kj}$. For instance, $\sum_i N_i x_{i1} / N$ also satisfies the conditions. Neyman (1934)
has shown that, for fixed values of $n_k$, the estimate given by (i) is the best linear unbiased
estimate of $\mu$, in the sense that its sampling variance is less than that of any other linear
unbiased estimate.

* Communication from the National Physical Laboratory. 





The question now arises: given a sample size $n$, how shall we choose the $n_k$ so as to minimize
var($m$), where $m$ is given by (i)? Bowley had not considered 'best' estimates, and he suggested
that $n_k$ should be proportional to $N_k$, i.e.

$$n_k = \frac{n N_k}{N}. \qquad \text{(ii)}$$


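Neyman (1934) showed, by the method given in §2, that the values of $n_k$ which minimize
var($m$) are

$$n_k = \frac{n N_k \sigma_k \sqrt{N_k/(N_k - 1)}}{\sum_i N_i \sigma_i \sqrt{N_i/(N_i - 1)}} = \frac{n N_k \sigma_k'}{\sum_i N_i \sigma_i'}, \qquad \text{(iii)}$$

where $\sigma_k' = \sigma_k \sqrt{N_k/(N_k - 1)}$.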

We shall refer to these two methods of defining the $n_k$, by (ii) and (iii) respectively, as
proportionate sampling, and optimum stratified sampling, denoting by $m_p$ and $m_o$ the estimates
of $\mu$ obtained from (i) by the two methods, and by $\bar{x}$ the estimate of $\mu$ given by the mean of
an unrestricted random sample of $n$ from the whole population $\pi$.
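The two rules (ii) and (iii) can be sketched in a few lines of Python. This is only an illustration: the function names and the three strata are hypothetical choices made for the sketch, and the $n_k$ are left as real numbers, without the rounding to integers that practice requires.

```python
from math import sqrt

def proportionate_allocation(n, N_k):
    """Formula (ii): n_k = n * N_k / N, left as real numbers (no rounding)."""
    N = sum(N_k)
    return [n * Nk / N for Nk in N_k]

def optimum_allocation(n, N_k, sigma_k):
    """Formula (iii): n_k proportional to N_k * sigma'_k, where
    sigma'_k = sigma_k * sqrt(N_k / (N_k - 1)). Requires every N_k > 1."""
    weights = [Nk * s * sqrt(Nk / (Nk - 1)) for Nk, s in zip(N_k, sigma_k)]
    total = sum(weights)
    return [n * w / total for w in weights]

# Hypothetical strata, chosen only for illustration.
N_k = [60, 30, 10]
sigma_k = [1.0, 2.0, 5.0]
prop = proportionate_allocation(20, N_k)
opt = optimum_allocation(20, N_k, sigma_k)
print(prop)
print([round(v, 2) for v in opt])
```

With these assumed values the small but highly variable third stratum receives a larger share of the sample under (iii) than under (ii).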

The optimum stratified method thus requires a knowledge of the $\sigma_k$. In practice, we should
never know the $\sigma_k$ exactly, unless the population had been subjected to exhaustive sampling,
in which case $\mu$ would be known exactly. Sukhatme (1935) has shown that, at any rate
for large $N_k$, if the $\sigma_k^2$ are estimated from a preliminary sample, and the $n_k$ defined by using
these estimates in (iii), there is a high probability that var($m_o$) < var($m_p$).* The efficiency
of this method will of course depend on the size of the preliminary sample, and Sukhatme's
investigation only dealt with one value of this (15 from each stratum). In some cases we
should be able to form a fairly good estimate of the $\sigma_k$ from past experience, and there would
be no need for a preliminary sample.

Another interesting comparison which has not been extensively investigated is that
between optimum stratified sampling and unrestricted random sampling. Wilks (1943) deals
with this for infinite populations, and obtains (pp. 88, 89) the result (in our notation),

$$\operatorname{var}(m_o) \le \operatorname{var}(m_p) \le \operatorname{var}(\bar{x}), \qquad \text{(iv)}$$

the first equality holding only when all the $\sigma_k$ are equal, and the second only when all the
$\mu_k$ are equal. (Our $N_k/N$ are replaced by $p_k$, where $p_k$ is the probability that $x$, when drawn
at random from $\pi$, is a member of $\pi_k$, so that, for instance, (iii) becomes

$$n_k = \frac{n p_k \sigma_k}{\sum_i p_i \sigma_i}.$$

Representative sampling as defined by Wilks is what we should call proportionate sampling.)
We shall show in §2 that for finite populations, while the relation

$$\operatorname{var}(m_o) \le \operatorname{var}(m_p) \qquad \text{(v)}$$

is always true, the equality holding only when all the $\sigma_k'$ are equal, it is not necessarily true
that

$$\operatorname{var}(m_p) \le \operatorname{var}(\bar{x}), \qquad \text{(vi)}$$

* No confusion need arise from the fact that the symbol $m_o$ and the term optimum are still used
when estimates of the $\sigma_k$ are used in (iii).





and in fact in the limiting case when all the $\mu_k$ are equal, it is true that

$$\operatorname{var}(m_p) > \operatorname{var}(\bar{x}), \qquad \text{(via)}$$

so that if the $\sigma_k'$ are also equal

$$\operatorname{var}(m_o) > \operatorname{var}(\bar{x}); \qquad \text{(vib)}$$

i.e. random sampling gives a more accurate estimate of the mean than any stratified sampling.
We shall see, however, that in almost all practical cases (iv) is true.


2. Derivation of formulae 

2.1. Results (iii) and (v). Using the notation of §1.2, we have the standard result that

$$\operatorname{var}(\bar{x}_k) = \frac{\sigma_k^2}{n_k}\,\frac{N_k - n_k}{N_k - 1} \quad \text{(see e.g. Wilks, p. 86).} \qquad \text{(vii)}$$

Therefore from (i),

$$\operatorname{var}(m) = \frac{1}{N^2}\sum_i \frac{N_i^2 \sigma_i^2 (N_i - n_i)}{n_i (N_i - 1)}. \qquad \text{(viii)}$$

The result (iii) may be obtained quite easily by finding the values of the $n_i$ which minimize
(viii) subject to the condition $\sum_i n_i = n$, using the method of Lagrange multipliers. Then,
substituting (ii) and (iii) in (viii), and applying Schwarz's inequality, we have (v). The
following method is due to Neyman.

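It may be verified from (viii) that

$$\operatorname{var}(m) = \frac{N-n}{nN^2}\sum_i N_i \sigma_i'^2 + \frac{1}{N^2}\sum_k n_k\left(\frac{N_k \sigma_k'}{n_k} - \frac{\sum_i N_i \sigma_i'}{n}\right)^2 - \frac{1}{nN}\sum_k N_k\left(\sigma_k' - \frac{\sum_i N_i \sigma_i'}{N}\right)^2. \qquad \text{(ix)}$$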

If we denote the three terms of (ix) by $A$, $B$ and $C$, so that

$$\operatorname{var}(m) = A + B - C,$$

it will be seen that $A$ and $C$ are independent of $n_k$ and, since $B$ is non-negative, it follows that
the values of $n_k$ which minimize var($m$) must minimize $B$. Now $B = 0$ if and only if

$$n_k = \frac{n N_k \sigma_k'}{\sum_i N_i \sigma_i'},$$

which is (iii). For these values of $n_k$, $m = m_o$, and

$$\operatorname{var}(m_o) = A - C. \qquad \text{(x)}$$

If we define $n_k$ by (ii), so that $m = m_p$, we see from (ix) that $B = C$, so that

$$\operatorname{var}(m_p) = A. \qquad \text{(xi)}$$

From (x) and (xi), we obtain (v), the equality holding only when $C = 0$, which is true only
when $\sigma_k' - \sum_i N_i \sigma_i' / N = 0$ for all $k$, i.e. when the $\sigma_k'$ are all equal.
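The decomposition var($m$) = $A + B - C$ can be checked numerically. The sketch below is a minimal illustration, not part of the original argument: the function names are hypothetical, (viii) is taken in the form with divisor $N_k - 1$, and the strata are those of Example 1 of §4 with $N = 13$, together with a second, invented population for the proportionate case.

```python
from math import sqrt, isclose

def var_m(N_k, sigma_k, n_k):
    """Formula (viii): variance of m for a given division n_k of the sample."""
    N = sum(N_k)
    return sum(Nk**2 * s**2 * (Nk - nk) / (nk * (Nk - 1))
               for Nk, s, nk in zip(N_k, sigma_k, n_k)) / N**2

def abc_terms(N_k, sigma_k, n_k):
    """The three terms A, B, C of (ix), written with sigma'_k."""
    N, n = sum(N_k), sum(n_k)
    sp = [s * sqrt(Nk / (Nk - 1)) for Nk, s in zip(N_k, sigma_k)]  # sigma'_k
    W = sum(Nk * s for Nk, s in zip(N_k, sp))                      # sum of N_i sigma'_i
    A = (N - n) / (n * N**2) * sum(Nk * s**2 for Nk, s in zip(N_k, sp))
    B = sum(nk * (Nk * s / nk - W / n)**2
            for Nk, s, nk in zip(N_k, sp, n_k)) / N**2
    C = sum(Nk * (s - W / N)**2 for Nk, s in zip(N_k, sp)) / (n * N)
    return A, B, C

N_k, sigma_k = [6, 4, 3], [2, 3, 4]          # Example 1 of section 4, N = 13
A, B, C = abc_terms(N_k, sigma_k, [1, 1, 2])
direct = var_m(N_k, sigma_k, [1, 1, 2])
print(direct, A + B - C)

# Proportionate allocation (ii) on a hypothetical population: B should equal C.
Ap, Bp, Cp = abc_terms([60, 30, 10], [1, 2, 5], [6, 3, 1])
print(Bp, Cp)
```

Both printed pairs should agree to rounding error, which is the content of (ix) and of the step from (ix) to (xi).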

2.2. Unrestricted random sampling. The variance of a random observation $x$ from $\pi$ is

$$\sigma^2 = \frac{\sum_i N_i \sigma_i^2}{N} + S,$$

where $S$ is the weighted sum of squares of the $\mu_i$, i.e. $S = \sum_i N_i (\mu_i - \mu)^2 / N$. From (vii),

$$\operatorname{var}(\bar{x}) = \frac{\sigma^2}{n}\,\frac{N-n}{N-1} = \frac{N-n}{nN(N-1)}\sum_i N_i \sigma_i^2 + \frac{N-n}{n(N-1)}S = \frac{N-n}{nN(N-1)}\sum_i (N_i - 1)\sigma_i'^2 + \frac{N-n}{n(N-1)}S.$$

From (xi), $\operatorname{var}(m_p) = \dfrac{N-n}{nN^2}\sum_i N_i \sigma_i'^2$.

Denoting $\sum_i N_i \sigma_i'^2 / N$ by $H$, and $\sum_i (N_i - 1)\sigma_i'^2 / (N-1)$ by $K$, we have

$$\operatorname{var}(\bar{x}) = \frac{N-n}{Nn}K + \frac{N-n}{(N-1)n}S, \qquad \text{(xii)}$$

and

$$\operatorname{var}(m_p) = \frac{N-n}{Nn}H.$$

Now

$$H - K = \frac{\sum_i (N - N_i)\sigma_i'^2}{N(N-1)} > 0,$$

and if we regard each $N_i$ as being of the same order, $O(N)$, then $H - K$ is $O(N^{-1})$, which means
that when all the $\mu_k$ are equal, $S = 0$, and so

$$\operatorname{var}(\bar{x}) < \operatorname{var}(m_p),$$

which is (via); but as $N \to \infty$,

$$\operatorname{var}(\bar{x}) \sim \operatorname{var}(m_p) + S/n, \qquad \text{(xiii)}$$

giving Wilks's result (p. 88) that for infinite populations (vi) is true, the equality holding
only when all the $\mu_k$ are equal.

From (v) and (via) it follows that for finite populations, when all the $\mu_k$ are equal and all
the $\sigma_k'$ are equal, (vib) is true, i.e. in this case unrestricted random sampling is actually
better than any stratified random sampling with the same sample size.
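A short numerical sketch of (via) and of the role of $S$ follows. It assumes the forms of (xi) and (xii) given in terms of $\sigma_k'$, uses the strata of Example 1 of §4 with $N = 13$ and $n = 4$, and the function names and the unequal means are hypothetical choices made for illustration.

```python
from math import isclose

def var_srs(N_k, sigma_k, mu_k, n):
    """var(x-bar) for an unrestricted random sample of n, from (vii)."""
    N = sum(N_k)
    mu = sum(Nk * m for Nk, m in zip(N_k, mu_k)) / N
    S = sum(Nk * (m - mu)**2 for Nk, m in zip(N_k, mu_k)) / N
    sigma2 = sum(Nk * s**2 for Nk, s in zip(N_k, sigma_k)) / N + S
    return sigma2 / n * (N - n) / (N - 1)

def var_prop(N_k, sigma_k, n):
    """var(m_p) = A of (ix), using sigma'_k^2 = N_k sigma_k^2 / (N_k - 1)."""
    N = sum(N_k)
    return (N - n) / (n * N**2) * sum(Nk**2 * s**2 / (Nk - 1)
                                       for Nk, s in zip(N_k, sigma_k))

def h_minus_k(N_k, sigma_k):
    """H - K = sum (N - N_i) sigma'_i^2 / (N (N - 1))."""
    N = sum(N_k)
    return sum((N - Nk) * Nk * s**2 / (Nk - 1)
               for Nk, s in zip(N_k, sigma_k)) / (N * (N - 1))

N_k, sigma_k, n = [6, 4, 3], [2, 3, 4], 4
N = sum(N_k)
equal = var_srs(N_k, sigma_k, [0, 0, 0], n)      # all mu_k equal, S = 0
unequal = var_srs(N_k, sigma_k, [0, 5, 10], n)   # hypothetical unequal means
vp = var_prop(N_k, sigma_k, n)
print(equal, vp, unequal)
```

With equal means the unrestricted sample has the smaller variance, the gap being $(N-n)(H-K)/(Nn)$; with the assumed unequal means the ordering is reversed.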


3. General comparison 

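3.1. From (ix), (x) and (xii),

$$\phi = \operatorname{var}(\bar{x}) - \operatorname{var}(m_o) = \frac{N-n}{n(N-1)}S + \frac{1}{nN^2(N-1)}(P - Q - R), \qquad \text{(xiv)}$$

where

$$P = N^2 \sum_i N_i \sigma_i'^2 - N\Big(\sum_i N_i \sigma_i'\Big)^2 \ge 0 \quad \text{(equality if all } \sigma_i' \text{ are equal)},$$

$$Q = n\Big(\sum_i N_i \sigma_i'^2 - N \sum_i \sigma_i'^2\Big) < 0,$$

$$R = N^2 \sum_i \sigma_i'^2 - \Big(\sum_i N_i \sigma_i'\Big)^2 > 0.$$

As $N \to \infty$, $P$, $Q$ and $R$ are respectively $O(N^3)$, $O(N)$ and $O(N^2)$, and so we have the result
that for infinite populations $\phi \ge 0$, which with (xiii) is easily seen to be equivalent to Wilks's
result (iv).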
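The identity (xiv) can be checked against a direct evaluation of var($\bar{x}$) - var($m_o$). The sketch below assumes the $n_k$ are the real-valued solutions of (iii), as the formula requires; the function names are hypothetical and the populations are illustrative (Example 1 of §4 with $N = 13$, and one assumed set of unequal means).

```python
from math import sqrt, isclose

def _setup(N_k, sigma_k, mu_k):
    N = sum(N_k)
    sp = [s * sqrt(Nk / (Nk - 1)) for Nk, s in zip(N_k, sigma_k)]  # sigma'_k
    mu = sum(Nk * m for Nk, m in zip(N_k, mu_k)) / N
    S = sum(Nk * (m - mu)**2 for Nk, m in zip(N_k, mu_k)) / N
    return N, sp, S

def phi_direct(N_k, sigma_k, mu_k, n):
    """var(x-bar) - var(m_o), with n_k taken exactly from (iii)."""
    N, sp, S = _setup(N_k, sigma_k, mu_k)
    W = sum(Nk * s for Nk, s in zip(N_k, sp))
    n_k = [n * Nk * s / W for Nk, s in zip(N_k, sp)]
    var_mo = sum(Nk * (Nk - nk) * s**2 / nk
                 for Nk, s, nk in zip(N_k, sp, n_k)) / N**2
    sigma2 = sum(Nk * s**2 for Nk, s in zip(N_k, sigma_k)) / N + S
    return sigma2 / n * (N - n) / (N - 1) - var_mo

def pqr(N_k, sigma_k, n):
    N, sp, _ = _setup(N_k, sigma_k, [0] * len(N_k))
    T1 = sum(Nk * s**2 for Nk, s in zip(N_k, sp))
    T2 = sum(s**2 for s in sp)
    W = sum(Nk * s for Nk, s in zip(N_k, sp))
    return N**2 * T1 - N * W**2, n * (T1 - N * T2), N**2 * T2 - W**2

def phi_formula(N_k, sigma_k, mu_k, n):
    """Right-hand side of (xiv)."""
    N, _, S = _setup(N_k, sigma_k, mu_k)
    P, Q, R = pqr(N_k, sigma_k, n)
    return (N - n) * S / (n * (N - 1)) + (P - Q - R) / (n * N**2 * (N - 1))

N_k, sigma_k, n = [6, 4, 3], [2, 3, 4], 4
for mu_k in ([0, 0, 0], [0, 2, 3.5]):
    print(phi_direct(N_k, sigma_k, mu_k, n), phi_formula(N_k, sigma_k, mu_k, n))
```

For the equal-means case the value is negative, which is the situation discussed in the text that follows.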

In the finite case, however, by suitable choice of the $\sigma_i'$ and $n$ we can make $\phi$ either positive
or negative. For instance, if the $\sigma_i'$ are all equal and $n$ is sufficiently small, $R$ predominates
in (xiv), and $\phi < 0$. As $n$ increases to $N$, $\phi$ increases to 0. (By considering $Q$ and $R$, it is not





obvious that in this case, but it must be remembered that (xiv) is only true if the $n_k$
are given by (iii), and this becomes impossible as $n$ approaches $N$. This will be remarked
upon below.) If the $\sigma_k'$ are sufficiently unequal, $P$ will predominate and $\phi > 0$. In this case

the factor ^ ^ in (xiv) will be positive, and <j> will decrease as n increases. 

The situations, then, in which (vib) is likely to be true (provided that the $n_k$ are really
given by (iii)) are when the $\mu_k$ are nearly equal, and when $N$ is small or the $\sigma_k$ are nearly
equal. We shall consider some examples in §4.

3.2. In applying the procedure of stratification, we shall make two departures from the
theory outlined above which will tend to nullify the advantages of the stratified method.
The first is that, as was pointed out in §1.2, we shall never know the $\sigma_k$ exactly, and the
degree to which our estimates from which the $n_k$ were obtained are accurate depends on the
circumstances. It seems quite likely that Sukhatme's result will be fairly well applicable to
finite populations, but there is an opportunity for research on this point.

The second respect in which we depart from theory lies in the fact that, even if the $\sigma_k$ are
exactly known, the $n_k$ that we choose can never be exactly as given by (iii); first because they
must be integers, which makes a considerable difference when $n$ is small (the size of the
smallest stratified sample from which an unbiased estimate of $\mu$ can be made is clearly $r$);
and secondly, $n_k$ cannot take values greater than $N_k$. In this latter case, if the values of, say,
$s$ of the $n_k$, as given by (iii), are greater than the corresponding $N_k$, we should let $n_k = N_k$
for these $s$ strata, and then set the other $(r - s)$ values of $n_k$ proportional to the corresponding
$N_k \sigma_k'$. This will clearly decrease var($\bar{x}$) - var($m_o$) as given by (xiv). For example, when
$n = N$, we have

$$\operatorname{var}(\bar{x}) = \operatorname{var}(m_o) = \frac{N-n}{n(N-1)}S = 0,$$

but the right-hand side of (xiv) is

$$\frac{1}{N^3}\Big\{N \sum_i N_i \sigma_i'^2 - \Big(\sum_i N_i \sigma_i'\Big)^2\Big\} \ge 0$$

(equality holding if all the $\sigma_i'$ are equal). In fact both these limitations will decrease the
theoretical advantage (if any) of stratified over random sampling, and we must take them
into account in assessing the relative merits of the two methods.


4. Examples 

In the four examples illustrated by Figs. 1-4, var($m_o$) and var($\bar{x}$) have been calculated for
different stratified populations, and $\psi = \log_{10}\{\operatorname{var}(\bar{x})/\operatorname{var}(m_o)\}$ plotted against $c = n/N$,
so that $\psi < 0$ if var($\bar{x}$) < var($m_o$). In each figure the different curves represent populations
with the same $\sigma_k$, with the $N_k$ in the same proportions but with different magnitudes, and
with the $\mu_k$ equal, so that $S = 0$.

Example 1. $\sigma_k = 2, 3, 4$; $N_k \propto 6, 4, 3$ ($N = 65, 26, 13$).

Example 2. $\sigma_k = 4, 5, 6$; $N_k \propto 6, 5, 4$ ($N = 120, 60, 30, 15$).

Example 3. $\sigma_k = 4, 5, 6$; $N_k \propto 3, 11, 4$ ($N = 126, 54, 36, 18$).

Example 4. $\sigma_k = 1, 1, 2, 3, 4$; $N_k \propto 5, 5, 1, 2, 3$ ($N = 128, 32$).

The first thing to be noticed about the graphs is that in each one $\psi$ increases, generally
speaking, as $n$ increases. Further, in any one example the range of $c$ for which $\psi < 0$ increases








as $N$ decreases; and in this sense we can say that for small samples of proportionate size
from a stratified population, the advantage (if any) of the stratified method decreases as
$N$ decreases.

Secondly, the curves are not smooth. The reason for this is clear. In the optimum stratified
method the $n_k$ are to be chosen approximately proportional to $N_k \sigma_k$ (a second approximation
is $(N_k + \tfrac{1}{2})\sigma_k$). In Example 1, the $N_k \sigma_k$ are all equal, and it follows that the $n_k$ should be
nearly equal. If $n \equiv 0 \pmod 3$ this can be done, but for $n \equiv 1, 2 \pmod 3$, var($m_o$) takes values
greater than it would if fractional $n_k$ were allowed. This produces a rise in the curve of $\psi$
for $n \equiv 0 \pmod 3$, which gradually disappears as $n$ increases since the effect is much greater
for small $n$. The same 'period' is noticeable in Fig. 2, but in Figs. 3 and 4, where the main
'periods' are respectively 15 and 30, the effect is smaller.

We saw in §3.1 that, broadly speaking, the advantage of the stratified method decreases
as the $\sigma_k$ tend to equality. This is illustrated by comparing Examples 1 and 2. In each of
these the $N_k \sigma_k$ are equal, but in Example 2 the $\sigma_k$ are proportionally more nearly equal.
Comparing curves for about the same $N$ ($N = 65, 26, 13$ in Fig. 1 with $N = 60, 30, 15$ in Fig. 2),
we see that in Fig. 2 the range of values of $c$ for which $\psi < 0$ is greater than in Example 1.

Fig. 3 has the same $\sigma_k$ as Fig. 2, but the $N_k \sigma_k$, and therefore the $n_k$, are different. The
curves are similar to those of Fig. 2, but the stratified method is still less advantageous
(especially for small values of $c$).

Example 4 has five instead of three strata, and there is quite large variation between the
$\sigma_k$ and between the $N_k \sigma_k$. There is no doubt here that $\psi > 0$, the only exception being for
$N = 32$, $n = 5$, where $\psi = -0.02$.

These examples may be said to give the maximum advantage to the stratified method, in
the sense that the calculated values of var($m_o$) depend on the best method of choosing the $n_k$.
If the $\sigma_k$ are not sufficiently well known to enable the best values of $n_k$ to be used, then we
shall get a larger value of var($m_o$). It must be remembered, however, that in all these
examples we assumed that there was no variation between the $\mu_k$, a situation which would
be very unlikely to occur in practice. Now it is clear from (xii) that if the same $N_k$ and $\sigma_k$
are considered as in one of the above examples, but the $\mu_k$ are now unequal, the effect is to
increase the value of var($\bar{x}$) by $(N-n)S/\{(N-1)n\}$, where $S = \sum_i N_i(\mu_i - \mu)^2/N$; so, in any
example where $\psi < 0$ for some particular values of $N$ and $n$, we can reverse the direction of
the inequality by choosing a sufficiently large value of $S$, say

$$S_0 = [\operatorname{var}(m_o) - \operatorname{var}(\bar{x})]\,(N-1)\,n/(N-n).$$

In comparing different values of $S_0$ for different examples, it must be remembered that the
order of magnitude of $S_0$ depends on the $\sigma_k$ and a suitable measure of comparison will be
$S_0/\sigma_0^2$, where $\sigma_0^2$ is the pooled variance within strata $= \sum_i N_i \sigma_i^2 / N$.

In Example 1, the largest value of $S_0$ is for $N = 13$, $n = 4$. Here var($m_o$) = 1.9172,
var($\bar{x}$) = 1.5577, and $S_0 = 1.917 = 0.231\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 2$, $\mu_3 = 3.5$, then $S = 2.066$.)

In Example 2, the largest value of $S_0$ is for $N = 15$, $n = 4$. Here var($m_o$) = 24.647,
var($\bar{x}$) = 19.119, and $S_0 = 28.14 = 0.289\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 7$, $\mu_3 = 13$, then $S = 28.51$.)

In Example 3, the largest value of $S_0$ is for $N = 18$, $n = 3$. Here var($m_o$) = 46.235,
var($\bar{x}$) = 30.523, and $S_0 = 53.42 = 0.515\sigma_0^2$. (If $\mu_1 = 0$, $\mu_2 = 8$, $\mu_3 = 17$, then $S = 57.5$.)

In Example 4, the largest value of $S_0$ is for $N = 32$, $n = 5$ (the only occasion in this example
where $\psi < 0$). Here var($m_o$) = 0.91406, var($\bar{x}$) = 0.87097, and $S_0 = 0.2474 = 0.049\sigma_0^2$.
(If $\mu_1 = \mu_2 = 0$ and $\mu_3 = \mu_4 = \mu_5 = 1$, then $S = 0.285$.)
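The figures quoted for Examples 1 and 4 can be reproduced by an exhaustive search over integer divisions of the sample. The sketch below is an assumption about how such values might be computed, not a record of the author's procedure: it tries every allocation with $1 \le n_k \le N_k$ and $\sum n_k = n$, evaluates (viii), and keeps the smallest variance. All function names are hypothetical.

```python
from itertools import product
from math import log10

def var_stratified(N_k, sigma_k, n_k):
    """Formula (viii); a fully sampled stratum (n_k = N_k) contributes zero."""
    N = sum(N_k)
    return sum(Nk**2 * s**2 * (Nk - nk) / (nk * (Nk - 1))
               for Nk, s, nk in zip(N_k, sigma_k, n_k) if nk < Nk) / N**2

def var_srs_equal_means(N_k, sigma_k, n):
    """var(x-bar) from (xii) when all mu_k are equal (S = 0)."""
    N = sum(N_k)
    sigma2 = sum(Nk * s**2 for Nk, s in zip(N_k, sigma_k)) / N
    return sigma2 / n * (N - n) / (N - 1)

def best_integer_allocation(N_k, sigma_k, n):
    """Smallest var(m) over integer n_k with 1 <= n_k <= N_k and sum n_k = n."""
    best = None
    for n_k in product(*[range(1, Nk + 1) for Nk in N_k]):
        if sum(n_k) != n:
            continue
        v = var_stratified(N_k, sigma_k, n_k)
        if best is None or v < best[0]:
            best = (v, n_k)
    return best

# Example 1 with N = 13, n = 4.
N_k, sigma_k, n = [6, 4, 3], [2, 3, 4], 4
N = sum(N_k)
v1, alloc1 = best_integer_allocation(N_k, sigma_k, n)
x1 = var_srs_equal_means(N_k, sigma_k, n)
S0 = (v1 - x1) * (N - 1) * n / (N - n)
print(alloc1, round(v1, 4), round(x1, 4), round(S0, 3))

# Example 4 with N = 32, n = 5.
v4, alloc4 = best_integer_allocation([10, 10, 2, 4, 6], [1, 1, 2, 3, 4], 5)
x4 = var_srs_equal_means([10, 10, 2, 4, 6], [1, 1, 2, 3, 4], 5)
psi4 = log10(x4 / v4)
print(alloc4, round(v4, 5), round(x4, 5), round(psi4, 2))
```

The search is exponential in the number of strata, which is acceptable only for populations as small as those in the examples.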


5. Conclusions 

We have seen in §3 that optimum stratified sampling may give a less accurate estimate of
$\mu$ than unrestricted random sampling when the $\mu_k$ are nearly equal, and when $N$ is small or
the $\sigma_k'$ are nearly equal. The examples of §4 bear out these conclusions and show that the
effect is greatest for small $n$, Fig. 3 providing an additional suggestion that if the products
$N_k \sigma_k$ are widely different the advantage of the stratified method tends to be nullified. In
practice, we should probably only apply stratified sampling if we knew that the strata were
sufficiently distinct to ensure considerable variation between either the $\mu_k$ or the $\sigma_k$. In
the first case, if nothing much was known about the $\sigma_k$ and a preliminary sample on the lines
suggested by Sukhatme was impracticable, we should use proportionate sampling, and the
size of $S$ would usually ensure that var($m_p$) < var($\bar{x}$). In the second case, we should use
optimum stratified sampling, and rely on the variability of the $\sigma_k$ to ensure that
var($m_o$) < var($\bar{x}$). Since an adequate degree of knowledge about the $\sigma_k$ would be unlikely
unless the $N_k$ were quite large, we should in this case almost certainly be safe in using the
method. To the above considerations must be added the fact that if very inaccurate estimates
of the $\sigma_k$ are used in (iii), then, whatever the nature of the population, the resulting procedure
may be extremely inefficient.

It must be realized, of course, that even if it were known that var (m 0 ) < var (x), it would 
not follow that the optimum stratified method would necessarily be the most convenient. 
It may be impossible, or at any rate inconvenient, to do any sort of random sampling, and 
some sort of quasi-random sampling may have to be used (see, e.g., Madow & Madow, 1944),
but if the principle of random sampling is applicable the stratified method is not likely to 
be much more inconvenient, and in fact in most cases will be more convenient, than the 
unrestricted method. 

Summary 

The stratified method has been used in the past almost solely for large-scale social and 
agricultural surveys. Here the stratum sizes are large, and known results for infinite popula- 
tions apply. There seems no reason why stratified sampling should not be used to advantage 
for smaller populations, and it is important to know to what extent these results still apply. In 
this paper a comparison has been made with unrestricted random sampling in the usual case 
where we are interested in estimating the mean. The advantages of the stratified method are 
modified, but in most cases where the method is applicable it will be found to be worth while. 

The above work was carried out as part of the research programme of the National 
Physical Laboratory, and this paper is published by permission of the Director of the 
Laboratory. The author desires to acknowledge the assistance rendered by Mr D. V. Lindley 
who prepared the diagrams. 

REFERENCES 

Bowley, A. L. (1926). Measurement of the precision attained in sampling. Bull. Int. Inst. Statist.
22, 1ère livraison.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. Griffin and Co.

Kendall, M. G. (1946). The Advanced Theory of Statistics, 2. Griffin and Co.

Madow, W. G. & Madow, L. H. (1944). On the theory of systematic sampling. Ann. Math. Statist.
15, 1-24.

Neyman, J. (1934). On the two different aspects of the representative method. J. Roy. Statist. Soc.
97, 558-625.

Sukhatme, P. V. (1935). Contribution to the theory of the representative method. Suppl. J. Roy.
Statist. Soc. 2, 253-68.

Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.





SOME THEOREMS ON TIME SERIES. I 

By P. A. P. MORAN 
Institute of Statistics, Oxford University 

One of the principal problems in the theory of time series is to discuss the relation between 
two series, and in the present paper we prove a theorem by which we can test whether two 
such series are independent. Such a test of significance must depend on the models which 
we assume for the probability processes which generate the series. In practice, the two most 
useful models are, first, that of a moving average of a series of independent random com- 
ponents and, secondly, the solutions of linear stochastic difference equations. 

Let

$$\ldots,\ \eta(t-1),\ \eta(t),\ \eta(t+1),\ \ldots$$

be a sequence of independent random variables each distributed in the same distribution
which we take to have zero mean and its second, third, and fourth moments finite. Then the
time series generated by

$$X(t) = \sum_{i=0}^{n} \alpha_i\, \eta(t-i)$$

is a moving average with weights $\alpha_0, \alpha_1, \ldots, \alpha_n$. On the other hand, consider a stochastic difference
equation of the form

$$X(t) + a_1 X(t-1) + \ldots + a_h X(t-h) = \eta(t). \qquad (1)$$

In order that the solution of (1) for successive values of $t$ shall form a stationary series it
is necessary to impose the condition that the roots of the characteristic equation

$$z^h + a_1 z^{h-1} + \ldots + a_h = 0 \qquad (2)$$

shall all lie inside the circle $|z| = 1$ (Wold, 1938, p. 53). When this is true the solution of (1)
can be shown to be of the form

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\, \eta(t-i),$$

where the $\alpha_i$ are certain functions of the roots of (2). In this case $\sum_{i=0}^{\infty} |\alpha_i|$ is majorized by a
convergent geometric series.

Thus we see that both the above models are included in the more general one in which
we define $X(t)$ as given by

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\, \eta(t-i),$$

where the $\alpha_i$ are any sequence of constants satisfying $\sum_{i=0}^{\infty} |\alpha_i| < \infty$. Now suppose

$$\ldots,\ \zeta(t-1),\ \zeta(t),\ \zeta(t+1),\ \ldots$$

is another sequence of independent random variables having a distribution with zero mean
and finite second, third and fourth moments. We write

$$Y(t) = \sum_{i=0}^{\infty} \beta_i\, \zeta(t-i),$$

where $\sum_{i=0}^{\infty} |\beta_i| < \infty$. To discuss whether two such empirical series of this form are correlated
we prove that the covariance

$$S = \sum_{t=1}^{n} X(t)\,Y(t) \qquad (3)$$




tends, as $n$ increases, to be distributed in the normal form about zero mean with a second
moment which is a function of the $\alpha_i$ and the $\beta_i$. We shall discuss later the calculation of
this second moment from empirical series, in which case some care is necessary.

We first illustrate our method of proof by considering the much simpler problem of determining
the asymptotic distribution of the sum

$$T_n = \sum_{s=1}^{n} X(t-s). \qquad (4)$$

We shall show that this asymptotic distribution is also, under certain conditions, normal. 
This result is interesting because it establishes a central limit theorem (and therefore a law 
of large numbers) for stationary stochastic processes of this type. The law of large numbers 
for Markov chains has been considered by several writers, in particular Bernstein (1927), 
who proves his results by using central limit theorems for non-independent components. 
His theorems cannot be applied in the present case, but some of the ideas of his methods can. 
Consider (4) above, where X(t) is defined by 

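$$X(t) = \sum_{i=0}^{\infty} \alpha_i\, \eta(t-i)$$

and $\sum_{i=0}^{\infty} |\alpha_i|$ is convergent. There is no loss in generality in supposing that

$$\sum_{i=0}^{\infty} |\alpha_i| < 1.$$

Clearly

$$E(T_n) = \sum_{s=1}^{n} \sum_{i=0}^{\infty} \alpha_i\, E[\eta(t-s-i)] = 0.$$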


Write $\sigma^2 = E(\eta^2)$, $c_0 = E[X(t)^2]$, $c_s = E[X(t)X(t-s)]$.

Then $c_0 = \sigma^2(\alpha_0^2 + \alpha_1^2 + \ldots)$, $c_s = \sigma^2(\alpha_0\alpha_s + \alpha_1\alpha_{s+1} + \ldots)$,

which are both clearly convergent. Moreover,


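$$R_n = E(T_n^2) = nE[X(t)^2] + 2\sum_{i=1}^{n-1}\sum_{s=1}^{n-i} E[X(t-i)X(t-i-s)] = nc_0 + 2\sum_{i=1}^{n-1}(n-i)c_i.$$

$n^{-1}R_n$ tends, as $n$ increases, to

$$R_0 = c_0 + 2\sum_{i=1}^{\infty} c_i$$

if this series converges absolutely. We shall show that $\lim n^{-1}R_n$ is finite. For $R_0$ is clearly

$$\sigma^2\Big(\sum_{i=0}^{\infty}\alpha_i\Big)^2,$$

and this is finite. Moreover, we notice that $n^{-1}R_n$ is not greater than

$$\sigma^2\Big(\sum_{i=0}^{\infty}|\alpha_i|\Big)^2. \qquad (5)$$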


We must now impose the condition that $\sum_{i=0}^{\infty} \alpha_i$ is not zero. This condition is necessary to our





method of argument. If it is not zero, it may be assumed, without loss of generality, to be
positive. We now show that as $n$ increases

$$\mathrm{pr}\{t_0(2R_n)^{1/2} < T_n < t_1(2R_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$


We require the following lemma (Bernstein, 1927, p. 12):

Lemma I. Let $\rho_n = \Sigma_n + \sigma_n$, where $\rho_n$, $\Sigma_n$ and $\sigma_n$ are random variables such that

$$E(\Sigma_n) = E(\sigma_n) = 0, \quad E(\Sigma_n^2) = H_n, \quad E(\sigma_n^2) = H_n'.$$

Then if, for $n$ large,

$$\mathrm{pr}\{t_0(2H_n)^{1/2} < \Sigma_n < t_1(2H_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt,$$

then

$$\mathrm{pr}\{t_0(2J_n)^{1/2} < \rho_n < t_1(2J_n)^{1/2}\},$$

where $J_n = E(\rho_n^2)$, tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt,$$

provided that

$$\lim_{n\to\infty} H_n'/H_n = 0.$$

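Let $\epsilon$ be an arbitrarily small number and choose $N$ so large that

$$\sum_{i=N}^{\infty} |\alpha_i| < \epsilon \sum_{i=0}^{\infty} |\alpha_i| < \epsilon.$$

Write

$$X_1(t) = \sum_{i=0}^{N} \alpha_i\, \eta(t-i), \qquad T_n' = \sum_{s=1}^{n} X_1(t-s).$$

Then $E(T_n') = 0$, and write $R_n' = E(T_n'^2)$.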

We shall prove that the distribution of $T_n'$ tends to normality, i.e. that

$$\mathrm{pr}\{t_0(2R_n')^{1/2} < T_n' < t_1(2R_n')^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$


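We first calculate $R_n$ and $R_n'$ in another way. For

$$T_n = \sum_{s=1}^{n} X(t-s) = \sum_{s=1}^{n}\sum_{i=0}^{\infty} \alpha_i\, \eta(t-s-i)$$

$$= \alpha_0\eta(t-1) + (\alpha_0+\alpha_1)\eta(t-2) + \ldots + (\alpha_0 + \ldots + \alpha_{n-1})\eta(t-n) + \sum_{s=1}^{\infty}(\alpha_s + \ldots + \alpha_{s+n-1})\,\eta(t-n-s),$$

and so

$$R_n = E(T_n^2) = \sigma^2\Big\{\alpha_0^2 + (\alpha_0+\alpha_1)^2 + \ldots + (\alpha_0 + \ldots + \alpha_{n-1})^2 + \sum_{s=1}^{\infty}(\alpha_s + \ldots + \alpha_{s+n-1})^2\Big\},$$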





and this series converges. On the other hand,

$$T_n' = \alpha_0\eta(t-1) + (\alpha_0+\alpha_1)\eta(t-2) + \ldots + (\alpha_0 + \ldots + \alpha_{N-1})\eta(t-N)$$

$$+ \sum_{p=1}^{n-N}(\alpha_0 + \ldots + \alpha_N)\,\eta(t-N-p) + (\alpha_1 + \ldots + \alpha_N)\eta(t-n-1) + \ldots + \alpha_N\eta(t-n-N),$$

and so

$$R_n' = \sigma^2\{\alpha_0^2 + (\alpha_0+\alpha_1)^2 + \ldots + (\alpha_0 + \ldots + \alpha_{N-1})^2 + (n-N)(\alpha_0 + \ldots + \alpha_N)^2 + (\alpha_1 + \ldots + \alpha_N)^2 + \ldots + \alpha_N^2\}.$$


Since we have already supposed that $\sum_{i=0}^{\infty} \alpha_i$ is positive, there exist positive numbers $N_0$ and
$d$ such that for all $N > N_0$, $\sum_{i=0}^{N} \alpha_i > d$. If this is not true the theorem is in general false. For
suppose the distribution of the $\eta$'s to be non-normal and write $\alpha_0 = 1$, $\alpha_1 = -1$, $\alpha_i = 0$
($i > 1$). Then the distribution of $T_n$ does not tend to normality and its variance does not
increase with $n$. We shall later show that this condition on the $\alpha$'s is in fact satisfied for the
solutions of stochastic difference equations.
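The two expressions for $R_n$, and the counter-example with $\alpha_0 = 1$, $\alpha_1 = -1$, can be illustrated for moving averages with finitely many weights. The sketch below takes $\sigma^2 = 1$; the function names are hypothetical and the second weight sequence is an assumed contrast case.

```python
def autocov(alpha, s):
    """c_s / sigma^2 = sum_i alpha_i alpha_{i+s} for finitely many weights."""
    return sum(alpha[i] * alpha[i + s] for i in range(len(alpha) - s))

def r_n_from_autocov(alpha, n):
    """R_n / sigma^2 = n c_0 + 2 sum_{i=1}^{n-1} (n - i) c_i."""
    return n * autocov(alpha, 0) + 2 * sum((n - i) * autocov(alpha, i)
                                           for i in range(1, n))

def r_n_from_weights(alpha, n):
    """R_n / sigma^2 as the sum of squared coefficients of eta(t-k) in T_n."""
    L = len(alpha)
    total = 0
    for k in range(1, n + L):               # eta(t-k) for k = 1, ..., n+L-1
        coeff = sum(alpha[k - s] for s in range(1, n + 1) if 0 <= k - s < L)
        total += coeff ** 2
    return total

# Counter-example of the text: the weights sum to zero, so R_n stays at 2.
print([r_n_from_autocov([1, -1], n) for n in range(1, 6)])
# Contrast case (assumed): weights summing to 2, so R_n grows like 4n.
print([r_n_from_autocov([1, 1], n) for n in range(1, 6)])
```

In the counter-example $T_n$ reduces to $\eta(t-1) - \eta(t-n-1)$, which is why its variance does not grow with $n$.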

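Now by the ordinary central limit theorem, as $n$ increases,

$$T_n'' = \sum_{p=1}^{n-N}(\alpha_0 + \ldots + \alpha_N)\,\eta(t-N-p)$$

tends to be distributed normally with zero mean and variance

$$R_n'' = (n-N)(\alpha_0 + \ldots + \alpha_N)^2\sigma^2,$$

that is

$$\mathrm{pr}\{t_0(2R_n'')^{1/2} < T_n'' < t_1(2R_n'')^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$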


Using Lemma I we see that the same is true, for fixed $N$, when we replace $T_n''$ by $T_n'$ and $R_n''$
by $R_n'$. Now

$$T_n = T_n' + Q,$$

say, where $Q$ is what we get if we replace the sequence $(\alpha_0, \alpha_1, \ldots)$ in $T_n$ by $(\alpha_{N+1}, \ldots)$ and
alter $t$, and from (5) we can choose $N$ so large that for $n > N$, $n^{-1}E(Q^2) < \epsilon$, say. Taking a
sequence $\epsilon_1, \epsilon_2, \ldots$ tending to zero and choosing first $N$ sufficiently large and then $n$ and
using Lemma I again, we see that

$$\mathrm{pr}\{t_0(2R_n)^{1/2} < T_n < t_1(2R_n)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to $\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt$.


To complete the discussion we must show that the condition we have imposed on the
sequence $\alpha_0, \alpha_1, \ldots$ is satisfied by the coefficients of the solutions of stationary stochastic
difference equations. Consider an equation

$$X(t) + a_1 X(t-1) + \ldots + a_h X(t-h) = \eta(t)$$

such that the roots of

$$z^h + a_1 z^{h-1} + \ldots + a_h = 0 \qquad (6)$$

all lie inside the circle $|z| = 1$. Then the solution of this equation is given (Wold, 1938,
p. 53) by

$$X(t) = \sum_{i=0}^{\infty} \alpha_i\, \eta(t-i),$$




where the $\alpha_i$ are now the solutions of the infinite set of equations

$$\alpha_0 = 1,$$
$$a_1\alpha_0 + \alpha_1 = 0,$$
$$a_2\alpha_0 + a_1\alpha_1 + \alpha_2 = 0,$$
$$\ldots$$
$$a_h\alpha_0 + a_{h-1}\alpha_1 + \ldots + \alpha_h = 0,$$
$$a_h\alpha_1 + \ldots + a_1\alpha_h + \alpha_{h+1} = 0,$$
$$\ldots$$

and since the left-hand side is an absolutely convergent double series, we add, obtaining

$$(1 + a_1 + \ldots + a_h)\sum_{i=0}^{\infty} \alpha_i = 1,$$

and so $\sum_{i=0}^{\infty} \alpha_i \ne 0$ and, as already observed, without loss of generality, may be supposed positive.
This quantity is finite because all the roots of equation (6) lie inside the circle $|z| = 1$.
Moreover, it follows that

$$R_0 = \Big(\sum_{i=0}^{\infty}\alpha_i\Big)^2\sigma^2 = (1 + a_1 + \ldots + a_h)^{-2}\sigma^2.$$

This is, in fact, proportional to the derivative at zero of the integrated power spectrum
(Wold, 1938, p. 69).
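The infinite set of equations for the $\alpha_i$ is a simple recursion, and the two closing identities can be checked numerically. The sketch below is illustrative only: the coefficients $a_1 = -0.5$, $a_2 = 0.06$ are an assumed example (the roots of $z^2 - 0.5z + 0.06 = 0$ are 0.2 and 0.3, both inside the unit circle), the series is truncated at 200 terms, $\sigma^2$ is taken as 1, and the function names are hypothetical.

```python
from math import isclose

def ar_weights(a, m):
    """First m weights alpha_0, ..., alpha_{m-1} for a = [a_1, ..., a_h], from
    alpha_0 = 1 and alpha_k = -(a_1 alpha_{k-1} + ... + a_h alpha_{k-h})."""
    alpha = [1.0]
    for k in range(1, m):
        alpha.append(-sum(a[j - 1] * alpha[k - j]
                          for j in range(1, min(k, len(a)) + 1)))
    return alpha

def long_run_variance(alpha):
    """c_0 + 2 sum_{s>=1} c_s with sigma^2 = 1, for a truncated weight list."""
    m = len(alpha)
    c = [sum(alpha[i] * alpha[i + s] for i in range(m - s)) for s in range(m)]
    return c[0] + 2 * sum(c[1:])

a = [-0.5, 0.06]                    # assumed example coefficients
alpha = ar_weights(a, 200)
print(alpha[:4])
print((1 + sum(a)) * sum(alpha))    # should be close to 1
print(long_run_variance(alpha), (1 + sum(a)) ** -2)
```

The truncation is harmless here because the weights decay geometrically; for roots close to the unit circle many more terms would be needed.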

We now turn to the problem of discussing the relation between two such series and we
consider the asymptotic distribution of

$$S_n = \sum_{t=1}^{n} X(-t)\,Y(-t),$$

where

$$X(t) = \sum_{i=1}^{\infty} \alpha_i\, \eta(t-i) \quad (\alpha_1 \ne 0), \qquad (7)$$

and

$$Y(t) = \sum_{i=1}^{\infty} \beta_i\, \zeta(t-i) \quad (\beta_1 \ne 0). \qquad (8)$$

We write in this form rather than that of (3) for the sake of convenience in what follows,
and we have altered the notation of the sums (7) and (8) so that they begin with the
coefficients $\alpha_1$ and $\beta_1$ for the same reason. Writing

$$c_s = E[X(t)X(t-s)], \quad d_s = E[Y(t)Y(t-s)] \quad (s = 0, 1, \ldots),$$

as before, we have

$$c_s = \sigma_1^2(\alpha_1\alpha_{s+1} + \alpha_2\alpha_{s+2} + \ldots), \quad d_s = \sigma_2^2(\beta_1\beta_{s+1} + \beta_2\beta_{s+2} + \ldots),$$

where $\sigma_1^2$ and $\sigma_2^2$ are the second moments of $\eta$ and $\zeta$. Then

$$E(S_n) = \sum_{t=1}^{n} E(X(-t)Y(-t)) = 0,$$

$$E(S_n^2) = E\Big\{\sum_{t=1}^{n} X(-t)Y(-t)\Big\}^2 = E\Big\{\sum_{t=1}^{n} X^2(-t)Y^2(-t) + 2\sum_{t=1}^{n-1}\sum_{s=1}^{n-t} X(-t)X(-t-s)Y(-t)Y(-t-s)\Big\}$$

$$= nc_0d_0 + 2\sum_{s=1}^{n-1}(n-s)\,c_sd_s. \qquad (9)$$




Consider the behaviour of $n^{-1}E(S_n^2)$ as $n$ increases. Clearly

$$n^{-1}E(S_n^2) \to c_0d_0 + 2\sum_{s=1}^{\infty} c_sd_s = C, \text{ say}, \qquad (10)$$


if the series $C$ is absolutely convergent. If $X$ and $Y$ are moving averages or the solutions of
stationary stochastic difference equations this is certainly true, for in the first case the series
is finite, and in the second it is majorized by a convergent geometric series. We show that it
is true in the general case by the following argument. Without restricting generality, we
may assume, as before, that $\sum_{1}^{\infty} |\alpha_i| < 1$, $\sum_{1}^{\infty} |\beta_i| < 1$. Then

$$|c_s| \le \sigma_1^2(|\alpha_1\alpha_{s+1}| + \ldots) \le \sigma_1^2(|\alpha_{s+1}| + \ldots),$$

and so

$$\Big|\sum_{s=1}^{\infty} c_sd_s\Big| \le \sigma_1^2\sigma_2^2\sum_{s=1}^{\infty}(|\beta_1||\beta_{s+1}| + \ldots) \le \sigma_1^2\sigma_2^2\Big(\sum_{1}^{\infty}|\beta_i|\Big)^2.$$

Also

$$c_0d_0 \le \sigma_1^2\sigma_2^2\Big(\sum_{1}^{\infty}|\alpha_i|^2\Big)\Big(\sum_{1}^{\infty}|\beta_i|^2\Big), \qquad (11)$$


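and so $c_0d_0 + 2\sum_{1}^{\infty} c_sd_s$ is finite. We now prove that $C$ is not zero. For

$$C = c_0d_0 + 2\sum_{s=1}^{\infty} c_sd_s = \sigma_1^2\sigma_2^2\Big[(\alpha_1^2 + \alpha_2^2 + \ldots)(\beta_1^2 + \beta_2^2 + \ldots) + 2\sum_{s=1}^{\infty}\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\alpha_m\alpha_{m+s}\beta_n\beta_{n+s}\Big]$$

and after some rearrangement, this equals

$$\sigma_1^2\sigma_2^2[(\alpha_1\beta_1)^2 + (\alpha_1\beta_2 + \alpha_2\beta_1)^2 + (\alpha_1\beta_3 + \alpha_2\beta_2 + \alpha_3\beta_1)^2 + \ldots],$$

and $(\alpha_1\beta_1)^2$ is greater than zero and the rest non-negative at least. We therefore conclude that

$$n^{-1}E(S_n^2) \to C,$$

where $0 < C < \infty$.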
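The rearrangement of $C$ into a sum of squares can be checked exactly for sequences with finitely many non-zero terms. The sketch below takes $\sigma_1^2 = \sigma_2^2 = 1$ and uses two short integer sequences invented for the purpose; the function names are hypothetical. The squares are those of the coefficients $\sum_{i+j=m} \alpha_i\beta_j$, which is the form the bracketed terms take.

```python
def lagged_products(w, s):
    """sum_i w_i w_{i+s} for a finite sequence w (zero when s >= len(w))."""
    return sum(w[i] * w[i + s] for i in range(len(w) - s))

def c_from_autocov(alpha, beta):
    """c_0 d_0 + 2 sum_{s>=1} c_s d_s with unit second moments."""
    m = max(len(alpha), len(beta))
    return (lagged_products(alpha, 0) * lagged_products(beta, 0)
            + 2 * sum(lagged_products(alpha, s) * lagged_products(beta, s)
                      for s in range(1, m)))

def c_from_squares(alpha, beta):
    """(a1 b1)^2 + (a1 b2 + a2 b1)^2 + (a1 b3 + a2 b2 + a3 b1)^2 + ..."""
    conv = [0] * (len(alpha) + len(beta) - 1)
    for i, a in enumerate(alpha):
        for j, b in enumerate(beta):
            conv[i + j] += a * b
    return sum(v * v for v in conv)

alpha = [2, -1, 3]      # alpha_1, alpha_2, alpha_3 (hypothetical values)
beta = [1, 4, -2]       # beta_1, beta_2, beta_3 (hypothetical values)
print(c_from_autocov(alpha, beta), c_from_squares(alpha, beta))
```

Because the first square is $(\alpha_1\beta_1)^2$ and the rest are non-negative, the second form makes the positivity of $C$ immediate.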


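Assuming as before that $\sum_{1}^{\infty} |\alpha_i| < 1$, $\sum_{1}^{\infty} |\beta_i| < 1$, we define $N$ so that

$$\sum_{N+1}^{\infty} |\alpha_i| < \epsilon \sum_{1}^{\infty} |\alpha_i| < \epsilon, \qquad (12)$$

$$\sum_{N+1}^{\infty} |\beta_i| < \epsilon \sum_{1}^{\infty} |\beta_i| < \epsilon, \quad \text{where } \epsilon \text{ is small}. \qquad (13)$$

We now write

$$X_1(t) = \sum_{i=1}^{N} \alpha_i\, \eta(t-i), \qquad (14)$$

$$Y_1(t) = \sum_{i=1}^{N} \beta_i\, \zeta(t-i), \qquad (15)$$

and consider the sum

$$\sum_{t=1}^{n} X_1(-t)\,Y_1(-t). \qquad (16)$$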



We begin by proving that when $n$ is large this sum tends to be distributed in the normal
form with a variance which is asymptotically equal to $nC_1$, where $C_1$ is obtained from $C$ by
putting $\alpha_i = \beta_i = 0$ for $i > N$. For it is then clearly true that

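Now consider

$$S_n' = \sum_{t=1}^{n} X_1(-t)\,Y_1(-t),$$

where $n$ is greater than $N$. For convenience of notation, we write

$$\eta_i = \eta(-i), \quad \zeta_i = \zeta(-i).$$

We then have

$$S_n' = \sum_{i=1}^{\infty}\sum_{j=1}^{\infty} A_{ij}\,\eta_i\zeta_j,$$

where the $A_{ij}$ are certain constants. Moreover

$$E(\eta_i\zeta_j) = 0 \quad \text{all } i, j,$$
$$E(\eta_i^2\zeta_j^2) = \sigma_1^2\sigma_2^2 \quad \text{all } i, j,$$
$$E(\eta_i^2\zeta_j\zeta_k) = E(\eta_j\eta_k\zeta_i^2) = 0 \quad \text{for } j \ne k,$$
$$E(\eta_i\eta_j\zeta_k\zeta_l) = 0 \quad \text{if } i \ne j \text{ or } k \ne l.$$

It therefore follows that

$$E(S_n'^2) = \sigma_1^2\sigma_2^2\sum_{i=1}^{\infty}\sum_{j=1}^{\infty} A_{ij}^2.$$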

i-U-l 

Inserting (14) and (15) in (16) we have 

A i} — 0 

if i>v + N, or j > n + N or | i — j \ > N — 1 , 

and = Ei + + 2 4 , 

N N 

where Si = 2 2 ^uVi^f 

i- 1 j=X 

with Afj = CLifij + a,_! + . . . + « 1+W A for * >.?> 

— OCj/Jj + + a lA+f— i f° r * <7, 

= a,/tf,+ +a<^ for i = j. 

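We also have

$$\Sigma_2 = \sum\sum A_{ij}\,\eta_i\zeta_j,$$

where the sum is taken over values of $i$ and $j$ such that $|i - j| < N$, $i \le n$, $j \le n$ and either
$N < i$ or $N < j$, where

$$A_{ij} = \alpha_1\beta_{p+1} + \dots + \alpha_{N-p}\beta_N \quad \text{for } j - i = p > 0,$$
$$A_{ij} = \alpha_{p+1}\beta_1 + \dots + \alpha_N\beta_{N-p} \quad \text{for } i - j = p > 0,$$
$$A_{ij} = \alpha_1\beta_1 + \dots + \alpha_N\beta_N \quad \text{for } i = j.$$

Then $E(\Sigma_2) = 0$, $E(\Sigma_2^2) = \sigma_1^2\sigma_2^2\sum\sum A_{ij}^2$,

where the sum is taken over the above values of $i$ and $j$. This equals

$$(n-N)\,\sigma_1^2\sigma_2^2[(\alpha_1\beta_N)^2 + (\alpha_1\beta_{N-1} + \alpha_2\beta_N)^2 + \dots + (\alpha_1\beta_1 + \dots + \alpha_N\beta_N)^2 + \dots + (\alpha_{N-1}\beta_1 + \alpha_N\beta_2)^2 + (\alpha_N\beta_1)^2]. \qquad (17)$$
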



Some theorems on time series 


We know that $\alpha_1 \ne 0$. Let $\beta_s$ be the first term of the sequence $\beta_N, \beta_{N-1}, \dots$ which is not
zero. Such a term certainly exists. Then the sum in the outer brackets of (17) will contain
a term of the form $(\alpha_1\beta_s)^2$ and consequently $E(\Sigma_2^2) > 0$, and for $N$ fixed will increase as $(n - N)$.

Next we have „ „ 

= ZhAyHiiS,, 

where either i^n,j>n and j — i< N, 

or j^n, i>n and i —j < N, 

and A u = u p+l fi x + ... + ot x /i lX _ p for i -j = p > 0 

= a ifi P +i + • • • + for j - i = p > 0. 

Then E(I a ) = 0, £(£*) - ct\<t\Z£A%, 


(18) 

(19) 


where the sum is taken over the values (18) and (19). 


n + N n + A* 

Finally £4 — £ £ 

<«»+l >»=n+l 

where A Xj = a r _ + ... + ct N fi N _ p for i-j = p> 0 

= + . . . + a.v-pAv for j-i = p> 0 

= Upfip + . . . + a.vAv for i - j - n+p>n, 

n + N n+N 

and tfGW-O. E(Z,l) = <r* i <r | 2 2 

i-n+1 j = n-f 1 

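We readily see that $E(\Sigma_i\Sigma_j) = 0$ for $i \ne j$, and therefore

$$E(S_n'^2) = E(\Sigma_1^2) + E(\Sigma_2^2) + E(\Sigma_3^2) + E(\Sigma_4^2).$$

Moreover, for constant $N$, $E(\Sigma_1^2)$, $E(\Sigma_3^2)$ and $E(\Sigma_4^2)$ are constant, and so for large $n$ we have

$$n^{-1}E(S_n'^2) \to C_2 = \sigma_1^2\sigma_2^2[(\alpha_1\beta_N)^2 + (\alpha_1\beta_{N-1} + \alpha_2\beta_N)^2 + \dots + (\alpha_1\beta_1 + \dots + \alpha_N\beta_N)^2 + \dots + (\alpha_N\beta_1)^2] \ne 0. \qquad (20)$$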


Now suppose that $N$ is fixed and consider the sum $\sum_{t=1}^{n} X_1(-t)\,Y_1(-t)$. We write

$$n = m^2 + mN + p,$$

where $p < 2m + N + 1$ and $n$ is large enough for $m$ to be greater than $N$. This equation fixes
$m$, which increases roughly as $n^{1/2}$ when $n$ increases. Write

$$\Big(\sum_{1}^{N} + \sum_{N+1}^{N+m} + \dots + \sum^{m^2+mN} + \sum_{n-p+1}^{n}\Big)X_1(-t)\,Y_1(-t) = V_1 + U_1 + V_2 + U_2 + \dots + V_m + U_m + W.$$

Then $V_1, \dots, V_m$ and $W$ are all independent and $E(V_1^2), \dots, E(V_m^2)$ are independent of $n$, and in
fact not greater than $KN$, where $K$ is a constant independent of $N$. Also $E(W^2)$ is not
greater than $K(2m + N + 1)$, where $K$ may be taken as the same constant. $U_1, \dots, U_m$ are
also all independent and $E(U_r^2)$ is asymptotically equal to $mC_2$ when $n$ (and therefore $m$)
are large. Therefore, writing

$$A_m = U_1 + \dots + U_m, \qquad B_m = V_1 + \dots + V_m + W,$$





we have

$$E(A_m) = 0, \qquad E(A_m^2) = \sum_{i=1}^{m} E(U_i^2),$$

$$E(B_m) = 0, \qquad E(B_m^2) = \sum_{i=1}^{m} E(V_i^2) + E(W^2),$$

and the latter increases as $m$, whilst the former increases as $m^2$, and so $E(B_m^2)/E(A_m^2) \to 0$
as $n$ increases.

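By Lemma I it is therefore sufficient to show that the distribution of $A_m$ tends to normality.

Lemma II (Liapounoff's Central Limit Theorem, Bernstein, 1927). If

$$S_m = \sum_{r=1}^{m} u_r^{(m)}$$

is the sum of $m$ independent quantities such that

$$E(u_r^{(m)}) = 0, \qquad E\{(u_r^{(m)})^2\} = b_r^{(m)}, \qquad E\{(u_r^{(m)})^4\} = c_r^{(m)},$$

and if, as $m$ increases,

$$c_m / b_m^2 \to 0,$$

where

$$c_m = \sum_{r=1}^{m} c_r^{(m)}, \qquad b_m = \sum_{r=1}^{m} b_r^{(m)} = E(S_m^2),$$

then

$$\Pr\{t_0(2b_m)^{1/2} < S_m < t_1(2b_m)^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to

$$\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt.$$
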


To apply the lemma we put $U_r = u_r^{(m)}$. We already have $E(U_r) = 0$. Also

$$m^{-1}E(U_r^2) \to C_2 > 0 \quad \text{by (20)},$$

and so $m^{-2}b_m \to C_2$.

Now consider 4 m) = E(U*), 


where 


^4 — 2 -^p8 2 /rlV+r-l)+p— 1 £rt/V+r-l)+e~ 1> 

p=l 8-1 


and the A m are calculated with m in place of n. Since the if a all have the same probability 
distribution and similarly for the C's, we shall write tj p and for i? f <A r +r-i)+p-i an d Sr(jv+r-i)+8-i 
for the sake of convenience. So we can write the above 


^4 ^ ^pqVp^q* 

p-1 8-1 

$U_r^4$ will be a polynomial of the fourth order in the $\eta$'s and the $\zeta$'s and its expectation may
be regarded as the sum of two distinct types of terms so that $E(U_r^4) = \Sigma E(w_1) + \Sigma E(w_2)$,
where the terms $w_1$ are of the form $A_{pq}^4\eta_p^4\zeta_q^4$, and the terms $w_2$ are of the form $A_{pq}^2A_{kl}^2\eta_p^2\zeta_q^2\eta_k^2\zeta_l^2$
with $(p, q) \ne (k, l)$. All other terms arising in the product will clearly vanish when the
expectation is taken.

Then, since the $A_{pq}$ are bounded and the number of non-zero terms in $w_1$ and $w_2$ are not
greater than $2N(m + N)$ and $4N^2(m + N)^2$ respectively, we have

$$E(U_r^4) < Km^2,$$

where $K$ is a constant depending on $N$ but independent of $m$ and $n$. It follows that

$$c_m / b_m^2$$

is of order $m^{-1}$ and tends to zero as $n$ and $m$ increase. The conditions of the lemma are
therefore satisfied and we conclude that

$$\Pr\{t_0[2E(A_m^2)]^{1/2} < A_m < t_1[2E(A_m^2)]^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to $\pi^{-1/2}\int_{t_0}^{t_1} e^{-t^2}\,dt$.

Applying Lemma I we have

$$\Pr\{t_0[2E(S_n'^2)]^{1/2} < S'_n < t_1[2E(S_n'^2)]^{1/2}\}$$

tends, uniformly in $t_0$ and $t_1$, to the same limit.

We now consider the relationship between $S'_n$ and $S_n$. Write

-,l. UL««) 

+ 1 




^W x + W 2 + W a . (21) 

We must now calculate the variance of these terms. Consider again (9). We have shown 
(11) that 

2 2 < 2cr\<r\(\f } 0 1 + | | + • ■•) 2 

< 2<r?<T|( | a 0 1 + 1 oti | + ...)*, 

and we now apply this to the three sums in equation (21). It follows that if $N$ be chosen to
satisfy the conditions (12) and (13) then

$$\lim n^{-1}E(W_1^2) < A\sigma_1^2\sigma_2^2\epsilon^2, \qquad \lim n^{-1}E(W_2^2) < A\sigma_1^2\sigma_2^2\epsilon^2, \qquad \lim n^{-1}E(W_3^2) < A\sigma_1^2\sigma_2^2\epsilon^2,$$

where $A$ is a constant independent of $N$.

It follows that

$$S_n = S'_n + W_1 + W_2 + W_3,$$

where the variance of $W_1$, $W_2$ and $W_3$ can be made small compared with that of $S_n$ by choosing
$N$ large. Then by first choosing $N$ large and then $n$ and using Tchebycheff's inequality, we
see that the distribution of $S_n$ tends to normality with variance $E(S_n^2)$ and this completes
the proof.

In the general application of the above results some care is needed. We can suppose that
our empirical values of $X$ and $Y$ are distributed about their sample means which we take
to be zero and we must estimate the variance of $S_n$ from formulae (9), or (approximately)
from (10). But we must not insert in this formula the sample covariances for the $c_s$ and the
$d_s$ because, as Bartlett (1946) has shown, the standard errors of the sample values of these
covariances are of order $n^{-1/2}$ and we cannot therefore expect the series (10) to converge, let
alone give the correct value. To use the formula correctly we must first decide on the order
and coefficients of the stochastic difference equation which we can suppose generated the
series and, from these coefficients, calculate the value of (9).

In the case where the series are generated by a three-term difference equation, the
calculations are simplified. Suppose the $X$ and $Y$ satisfy the equations

$$X(t+2) + aX(t+1) + bX(t) = \eta(t+2),$$

$$Y(t+2) + AY(t+1) + BY(t) = \zeta(t+2),$$

where $E(\eta(t)) = E(\zeta(t)) = 0$ and $E(\eta^2(t)) = \sigma_1^2$, $E(\zeta^2(t)) = \sigma_2^2$,
as before. For the series to be stationary, we must have $b < 1$, $B < 1$. We suppose that in
addition to this the series are oscillatory and so $a^2 < 4b$, $A^2 < 4B$. The solutions will then be

$$X(t) = \sum_{s=0}^{\infty} 2(4b - a^2)^{-1/2}\,p^s \sin\theta s\;\eta(t - s + 1),$$

$$Y(t) = \sum_{s=0}^{\infty} 2(4B - A^2)^{-1/2}\,P^s \sin\phi s\;\zeta(t - s + 1),$$

where $p = b^{1/2}$, $P = B^{1/2}$, $\cos\theta = -a(2b^{1/2})^{-1}$, $\cos\phi = -A(2B^{1/2})^{-1}$. Also (Kendall, 1946, p. 408)

$$r_s = \frac{c_s}{c_0} = \frac{p^s\sin(s\theta + \psi)}{\sin\psi}, \qquad \rho_s = \frac{d_s}{d_0} = \frac{P^s\sin(s\phi + \Psi)}{\sin\Psi},$$

where

$$\tan\psi = \frac{1 + p^2}{1 - p^2}\tan\theta \quad \text{and} \quad \tan\Psi = \frac{1 + P^2}{1 - P^2}\tan\phi,$$

and

$$c_0 = \sigma_1^2\,\frac{1 + b}{(1 - b)\{(1 + b)^2 - a^2\}}, \qquad d_0 = \sigma_2^2\,\frac{1 + B}{(1 - B)\{(1 + B)^2 - A^2\}}.$$

We then need to calculate

$$C = c_0 d_0 + 2\sum_{s=1}^{\infty} c_s d_s = c_0 d_0\Big\{1 + 2\sum_{s=1}^{\infty} r_s\rho_s\Big\} = c_0 d_0\Big\{1 + 2\sum_{s=1}^{\infty}\frac{p^sP^s\sin(s\theta + \psi)\sin(s\phi + \Psi)}{\sin\psi\,\sin\Psi}\Big\}$$

$$= c_0 d_0\Big\{1 + \frac{pP}{\sin\psi\,\sin\Psi}\Big[\frac{\cos(\theta - \phi + \psi - \Psi) - pP\cos(\psi - \Psi)}{1 - 2pP\cos(\theta - \phi) + p^2P^2} - \frac{\cos(\theta + \phi + \psi + \Psi) - pP\cos(\psi + \Psi)}{1 - 2pP\cos(\theta + \phi) + p^2P^2}\Big]\Big\}. \qquad (22)$$

It is probably easiest to calculate $C$ from this equation rather than attempt to simplify
(22) still further. I hope to discuss the practical application of these formulae in another
paper.
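As a numerical illustration of (22), the sketch below evaluates $C$ for two second-order schemes in two ways: by summing $r_s\rho_s$ directly and by the closed form. It is a sketch under stated assumptions, not the author's procedure: the autocorrelations are generated by the recursion $r_s = -a r_{s-1} - b r_{s-2}$ with $r_1 = -a/(1+b)$ (the usual Yule-Walker relations for this scheme, which the text does not state), $\sigma_1 = \sigma_2 = 1$ by default, and all function names are illustrative.

```python
import math


def ar2_autocorr(a, b, smax):
    """r_0..r_smax for X(t+2) + a X(t+1) + b X(t) = eta(t+2), by recursion."""
    r = [1.0, -a / (1.0 + b)]
    for _ in range(2, smax + 1):
        r.append(-a * r[-1] - b * r[-2])
    return r[: smax + 1]


def ar2_phase(a, b):
    """Return (p, theta, psi) with tan(psi) = (1 + p^2) / (1 - p^2) * tan(theta)."""
    p = math.sqrt(b)
    theta = math.acos(-a / (2.0 * p))
    psi = math.atan2((1.0 + b) * math.sin(theta), (1.0 - b) * math.cos(theta))
    return p, theta, psi


def ar2_variance(a, b, sigma_sq=1.0):
    """c_0 = sigma^2 (1 + b) / ((1 - b) {(1 + b)^2 - a^2})."""
    return sigma_sq * (1.0 + b) / ((1.0 - b) * ((1.0 + b) ** 2 - a * a))


def c_series(a, b, A, B, smax=500):
    """C by direct summation of c_0 d_0 (1 + 2 sum r_s rho_s), truncated at smax."""
    r, rho = ar2_autocorr(a, b, smax), ar2_autocorr(A, B, smax)
    tail = sum(r[s] * rho[s] for s in range(1, smax + 1))
    return ar2_variance(a, b) * ar2_variance(A, B) * (1.0 + 2.0 * tail)


def c_closed(a, b, A, B):
    """C from the closed form (22)."""
    p, th, psi = ar2_phase(a, b)
    P, ph, Psi = ar2_phase(A, B)
    q = p * P

    def term(omega, gamma):
        return (math.cos(omega + gamma) - q * math.cos(gamma)) / (
            1.0 - 2.0 * q * math.cos(omega) + q * q)

    bracket = term(th - ph, psi - Psi) - term(th + ph, psi + Psi)
    return ar2_variance(a, b) * ar2_variance(A, B) * (
        1.0 + q * bracket / (math.sin(psi) * math.sin(Psi)))
```

The closed form avoids choosing a truncation point, while the direct sum is easier to adapt when the generating scheme is not of the three-term type.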

REFERENCES 

Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Uppsala. 

Bernstein, S. (1927). Math. Ann. 97, 1. 

Bartlett, M. S. (1946). Supp. J. Roy. Statist. Soc. 8, 27. 

Kendall, M. G. (1946). Advanced Theory of Statistics, 2. London: Charles Griffin and Co.





RANK CORRELATION BETWEEN TWO VARIABLES, ONE OF 
WHICH IS RANKED, THE OTHER DICHOTOMOUS 

By J. W. WHITFIELD, Psychological Laboratory, University of Cambridge

Rank correlation is one of the most useful statistical techniques available for the treatment
of data arising in experimental and applied psychological research. Chambers (1946) has
indicated the type of data most frequently occurring in these fields, and has pointed out the
advantages of Kendall's $\tau$ over Spearman's $\rho$ or any form of transformation to ordinal form.

Given the use of $\tau$ when tied rankings are present (Kendall, 1946) it seemed possible to
extend the method to cover a very common problem in psychology, namely, determination
of the relation between two variables, one of which is expressed as a ranking and the other
as a dichotomy. In applied or field work the relation of a psychological 'measurement' and
an external criterion nearly always appears in this form. The usual method of determining
the relationship consists of reducing the ranking to a dichotomy and calculating $\chi^2$ for the
2 × 2 table which results. That this may lead to inaccuracy can be seen from the following
example:

Variable A   1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20

Variable B   +  +  +  +  +  +  +  −  −  −  +  +  +  −  −  −  −  −  −  −

Variable C   −  −  −  +  +  +  +  +  +  +  −  −  −  −  −  −  −  +  +  +

Here the data are supposed to be ranked according to variable A and dichotomized into 
+ and — with respect to variables B and C. 

Treating the relation between variables A and B as a 2 × 2 contingency table:

                                 Variable B
                                  +      −
Variable A: Rankings 1-10         7      3
            Rankings 11-20        3      7

Applying $\chi^2$, P is found to be 0.074 without Yates's correction for continuity, or 0.180 if
the correction is applied.
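The two probabilities quoted can be reproduced with a few lines of code. This is a minimal sketch using the usual $\chi^2$ formula for a 2 × 2 table with one degree of freedom; the function name is illustrative, and the tail probability is obtained from the identity $P(\chi^2_1 > x) = \operatorname{erfc}\sqrt{x/2}$.

```python
import math


def chi2_2x2(a, b, c, d, yates=False):
    """Chi-square (1 d.f.) and its tail probability for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates's correction: reduce |ad - bc| by n/2, but not below zero.
        diff = max(diff - n / 2.0, 0.0)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))
```

Calling `chi2_2x2(7, 3, 3, 7)` and `chi2_2x2(7, 3, 3, 7, yates=True)` gives $\chi^2$ = 3.2 and 1.8 respectively for the table above.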

But $\chi^2$ is exactly the same for the contingency table relating variables A and C, although
it is obvious from the data that there is considerable difference in the two relationships, the
evidence for which is sacrificed by reducing the ranking to a dichotomy.

If, alternatively, we consider the dichotomous variable as a ranking composed entirely
of two sets of tied rankings, we may calculate the coefficients between A and B, A and C
respectively, which I shall denote by $\tau_{AB}$, $\tau_{AC}$. The corresponding values of $S$ will be found
to be, after the manner described by Kendall (1946):

$$S_{AB} = +70 - 9 + 21 = +82,$$

$$S_{AC} = -30 + 49 - 21 = -2.$$
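The two values of $S$ can be checked by direct counting. In the sketch below each pair of members contributes +1 when a '+' precedes a '−' in the order of variable A, −1 when a '−' precedes a '+', and nothing when the two signs agree. The sign strings are the patterns of the example (7, 3, 3, 7 for B and 3, 7, 7, 3 for C); the function name is illustrative.

```python
def s_score(signs):
    """S for an untied ranking (members listed in rank order) against a +/- dichotomy."""
    s = 0
    for i in range(len(signs)):
        for j in range(i + 1, len(signs)):
            if signs[i] == "+" and signs[j] == "-":
                s += 1
            elif signs[i] == "-" and signs[j] == "+":
                s -= 1
    return s


variable_b = "+" * 7 + "-" * 3 + "+" * 3 + "-" * 7
variable_c = "-" * 3 + "+" * 7 + "-" * 7 + "+" * 3
```

Both strings give the same 2 × 2 table when the ranking is cut at 10, yet their $S$ scores differ widely.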

For the calculation of $\tau$ in the case of tied rankings we have a choice in the denominator
by which $S$ is to be divided to give $\tau$. In the untied case this would be $\tfrac12 n(n-1)$, where $n$ is
the number of ranks. In the tied case we may take the denominator $S'$ as $\tfrac12 n(n-1)$ or as
$[\tfrac12 n(n-1)\{\tfrac12 n(n-1) - \tfrac12\Sigma t(t-1)\}]^{1/2}$, where $t_1$, $t_2$, etc., are the extent of the ties. The choice is
determined by practical considerations (see Kendall, 1946), but is not material to a
discussion of significance. For an untied ranking and a dichotomy with $x$ and $y$ members in
each class, the second form reduces to $\{\tfrac12 xyn(n-1)\}^{1/2}$.

In the case of two untied rankings Kendall has shown that $\operatorname{var} S = \tfrac{1}{18}n(n-1)(2n+5)$.
In the case of one untied ranking and one with ties of extent $t_1$, $t_2$, etc., Sillitto (1947) has
extended this result by proving that

$$\operatorname{var} S = \tfrac{1}{18}\{n(n-1)(2n+5) - \Sigma t(t-1)(2t+5)\}. \qquad (1)$$

In the case of an untied ranking and a dichotomy, $t_1 = x$, $t_2 = y$, $(x + y) = n$, and we have
then the simple form

$$\operatorname{var} S = \tfrac13 xy(n+1). \qquad (2)$$

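In the example above this gives

$$\sqrt{\operatorname{var} S} = \sqrt{\frac{(10)(10)(21)}{3}} = 26.46, \qquad \frac{S_{AB} - 1}{\sqrt{\operatorname{var} S}} = \frac{82 - 1}{26.46} = 3.06.$$

The probability of a deviation greater than this in absolute value is 0.0022. Further,

$$\frac{|S_{AC}| - 1}{\sqrt{\operatorname{var} S}} = \frac{2 - 1}{26.46} = 0.0378,$$

and the corresponding probability is 0.970.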
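The arithmetic of this test is easily mechanized. The sketch below applies equation (2) and the continuity correction of 1 (half the interval of 2 between successive $S$ values in the untied case), and converts the deviate into a two-sided normal probability with `math.erfc`. The function names are illustrative.

```python
import math


def var_s_dichotomy(x, y):
    """Equation (2): var S = xy(n + 1)/3 for an untied ranking and an x/y dichotomy."""
    return x * y * (x + y + 1) / 3.0


def normal_test(s, variance, correction=1.0):
    """Continuity-corrected deviate and two-sided normal tail probability."""
    z = (abs(s) - correction) / math.sqrt(variance)
    return z, math.erfc(z / math.sqrt(2.0))
```

For the example, `normal_test(82, var_s_dichotomy(10, 10))` and `normal_test(-2, var_s_dichotomy(10, 10))` correspond to the two deviates worked above.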


Variance when there are ties in the ranking 
The variance of $S$ given by equation (2) is true only in the case of a dichotomy and an untied
ranking. For a tied ranking I surmised from some special cases that

$$\operatorname{var} S = \frac{xy}{3n(n-1)}\{(n^3 - n) - \Sigma(t^3 - t)\}. \qquad (3)$$

In the note following this paper Mr Kendall provides proof of this result.

Example (from data collected by the Medical Research Council team in Germany 1946,
as yet unpublished). Selected workers in a factory were interviewed and an assessment
made of their adaptation to living conditions. They were assessed as 'Efficient' or
'Overactive'. Other data were available, including statements by the men of frequency of nocturia.
For men aged 50-59 years the following was observed:

Assessment     Rank order of frequency of nocturia
               (least frequent nocturia given highest rank)

Efficient      2½, 2½, 2½, 2½, 6½, 6½, 10, 10, 10, 10, 14, 14

Overactive     5, 10, 14, 16, 17

Five is the highest ranking in the overactive group. Four members of the efficient group
have higher rankings, and eight lower rankings. The $S$ score for that member is therefore
4 − 8. Similarly, for all members we have

$$S = 4 - 8 + 6 - 2 + 10 + 12 + 12 = +34.$$
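The counting rule just described can be written out as a short routine: for each overactive member, add one for every efficient member ranked higher (a smaller rank number), subtract one for every efficient member ranked lower, and ignore ties. This is a sketch of that rule only; the names are illustrative.

```python
def s_score_groups(first_group, second_group):
    """S for a dichotomy against a ranking that may contain ties."""
    s = 0
    for rank_b in second_group:
        for rank_a in first_group:
            if rank_a < rank_b:
                s += 1      # first-group member ranked higher
            elif rank_a > rank_b:
                s -= 1      # first-group member ranked lower
            # equal ranks (ties) contribute nothing
    return s


efficient = [2.5] * 4 + [6.5] * 2 + [10] * 4 + [14] * 2
overactive = [5, 10, 14, 16, 17]
```

Exchanging the two groups changes only the sign of the score.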



Using a denominator in the form $[xy\{\tfrac12 n(n-1) - \tfrac12\Sigma t(t-1)\}]^{1/2}$, $\tau$ is given by

$$\tau = \frac{+34}{[(12)(5)\{\tfrac12(17)(16) - \tfrac12(4)(3) - \tfrac12(2)(1) - \tfrac12(5)(4) - \tfrac12(3)(2)\}]^{1/2}} = \frac{+34}{\sqrt{6960}} = +0.408.$$

From (3) we then have

$$\operatorname{var} S = \frac{(12)(5)}{3(17)(16)}\{(17^3 - 17) - (4^3 - 4) - (2^3 - 2) - (5^3 - 5) - (3^3 - 3)\}$$

$$= 344.6.$$
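Equation (3) needs only the sizes of the two classes and the extents of the ties in the ranking, which can be counted from the pooled ranks. A minimal sketch follows; the helper names are illustrative.

```python
from collections import Counter


def tie_extents(ranks):
    """Extent of each tied set (untied members count as extent 1)."""
    return list(Counter(ranks).values())


def var_s_tied(x, y, extents):
    """Equation (3): var S = xy / (3n(n-1)) * {(n^3 - n) - sum(t^3 - t)}."""
    n = x + y
    return x * y / (3.0 * n * (n - 1)) * ((n ** 3 - n) - sum(t ** 3 - t for t in extents))


ranks = [2.5] * 4 + [6.5] * 2 + [10] * 4 + [14] * 2 + [5, 10, 14, 16, 17]
variance = var_s_tied(12, 5, tie_extents(ranks))
```

With no ties the expression reduces to equation (2), since every $t^3 - t$ vanishes.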


A small problem arises when we consider the correction for continuity to be applied in
testing the significance of an observed value of $S$. In the case of a dichotomy and an untied
ranking the interval between successive $S$ values is 2. In the case of a dichotomy and a
ranking composed entirely of ties of the same extent $t$, the interval is $2t$. But in the example
the ties are of varying extent, and the interval between successive $S$ values is composed of
a mixture of the intervals produced by the successive rank values. Thus, although these
varying intervals are combined so that over most of the range the interval between successive
values is unity, the distribution oscillates somewhat, and to use the value 1 as the correction
for continuity would sometimes be misleading. Further work is required to determine the
correction which will provide a probability on the normal distribution equal to or slightly
greater than the true probability in all cases. Until this is available I propose to use a crude
correction, based on the average of the intervals mentioned above. In the example the
successive rank values 2½ and 5 give an interval of 5 in $S$ score, rank values 5 and 6½ give an
interval of 3, and it is therefore possible to determine the average interval by calculating
the intervals given by successive rank values. This calculation can be shortened. The total
of the $S$ score intervals is twice the number of members, less the extent of the ties involving
the first and last members. If we divide this by the number of intervals between successive
rank values we have the average $S$ score interval. In the example this is $\tfrac16(34 - 4 - 1)$. Using
half of this as the correction for continuity we have

$$\frac{34 - 2.42}{\sqrt{\operatorname{var} S}} = 1.702.$$
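The shortened calculation of the average interval can be sketched as follows. The tie extents are listed in rank order (4 at rank 2½, 1 at rank 5, 2 at rank 6½, 5 at rank 10, 3 at rank 14, and one each at 16 and 17); the variance is taken as 344.56, the value given by equation (3) before rounding; the function name is illustrative.

```python
import math


def average_s_interval(extents_in_rank_order):
    """(2n - first extent - last extent) / (number of gaps between rank values)."""
    t = extents_in_rank_order
    return (2 * sum(t) - t[0] - t[-1]) / (len(t) - 1)


extents = [4, 1, 2, 5, 3, 1, 1]
correction = average_s_interval(extents) / 2.0      # half the average interval
deviate = (34 - correction) / math.sqrt(344.56)
one_tailed_p = 0.5 * math.erfc(deviate / math.sqrt(2.0))
```

The small difference between the deviate computed here and the printed 1.702 comes only from rounding the correction to 2.42 in the text.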


The pre-observational hypothesis, made on psychological grounds, was that excessive
nocturia is a symptom of inefficient adaptation to living conditions, i.e. a positive correlation
should be obtained. From these observations the probability of a positive correlation as
great or greater than the observed value appearing by chance is 0.044. Direct calculation
of the positive tail of the distribution of $S$ gives a probability of 0.0368.

The alternative testing hypothesis based on the absolute value of $S$ gives a probability
twice as great, and the corresponding direct calculation using both positive and negative
tails of the actual $S$ distribution gives a probability of 0.0736.

By itself this evidence could only be debatable substantiation of the psychological
hypothesis. In fact, additional data from two other factory groups, treated in the same
way, gave a total $S$ value of +104, the square-root of the total variance being 36.00, providing
a justification of the hypothesis.





The case of the 2x2 table 

If one dichotomous variable can be considered as a ranking with two sets of tied ranks it is
logical to consider the case when both variables are in this form. If we have a 2 × 2 table
in the form

(AB)    (Ab)    (A)

(aB)    (ab)    (a)

(B)     (b)     N

any member of (AB) taken with any member of (ab) has the same order in either ranking
and hence contributes +1 to $S$, and any member of (Ab) with any member of (aB)
contributes −1. The others contribute nothing. Hence

$$S = (AB)(ab) - (Ab)(aB).$$

From equation (3)

$$\operatorname{var} S = \frac{(A)(a)}{3N(N-1)}\big[(N^3 - N) - \{(B)^3 - (B)\} - \{(b)^3 - (b)\}\big] = \frac{(A)(a)(B)(b)}{N-1}. \qquad (4)$$


Again, for testing the significance of an observed value of $S$ it is necessary to correct for
continuity by subtracting half the interval between successive $S$ values. In the case of the
2 × 2 table the interval is $N$, for if we increase (AB) by unity $S$ becomes

$$\{(AB) + 1\}\{(ab) + 1\} - \{(Ab) - 1\}\{(aB) - 1\} = (AB)(ab) - (Ab)(aB) + N.$$

Hence, for the normal deviate, we have

$$\frac{S - \tfrac12 N}{\sqrt{\dfrac{(A)(a)(B)(b)}{N-1}}}. \qquad (5)$$

It will be noted that $\tau$ (taking the ties into account in calculating the denominator $S'$) is

$$\frac{(AB)(ab) - (Ab)(aB)}{[(A)(a)(B)(b)]^{1/2}},$$

which is the product-moment correlation for a 2 × 2 table when the variables are
conventionally regarded as possessing the discrete values 0, 1.

Testing by use of the normal deviate seems to be moderately accurate, and would appear
to be useful in those cases where $\chi^2$ is suspect because of small expectations in the cells of
the 2 × 2 table. It is less laborious to calculate than the hypergeometric treatment, and is
an alternative form of the approximation to hypergeometric treatment given by Pearson
(1947), who also discusses the order of accuracy of the approximation.

Using the data given earlier as an example, but assuming that it had been possible only
to grade nocturia into 'Normal' or 'Excessive', we have the following table:

                      Nocturia
Assessment       Normal    Excessive

Efficient          10          2

Overactive          2          3

$$S = 30 - 4 = 26, \qquad \operatorname{var} S = \frac{(12)(5)(12)(5)}{16} = 225.$$

This gives, after correction for continuity,

$$\frac{S - \tfrac12 N}{\sqrt{\operatorname{var} S}} = \frac{26 - 8.5}{15} = 1.1667.$$

This gives the probability of $S$ being attained or exceeded in the direction of the hypothesis
(i.e. positive values only) as 0.1217. $\chi^2$ without the continuity correction gives P = 0.0369,*
and with the correction, P = 0.1143. The hypergeometric treatment, summing the
probabilities of obtaining 3, 4 or 5 in the Overactive-Excessive category, gives P = 0.1166.

If the more customary test of absolute value is applied, $\chi^2$ with Yates's correction gives
P = 0.2286, and $S$ and the normal deviate gives P = 0.2434, i.e. both values of P are doubled.
The hypergeometric treatment, adding the probability of obtaining 0 in the Overactive-
Excessive category, gives P = 0.2445.

It will be seen that in conditions such as these, $S$ and the normal deviate give a reasonable
approximation to the exact treatment.
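The comparison between the normal deviate (5) and the hypergeometric treatment can be reproduced as below. This is a sketch only: the cell arguments follow the table layout (AB), (Ab), (aB), (ab), and the function names are illustrative.

```python
import math


def two_by_two_deviate(AB, Ab, aB, ab):
    """S, var S from (4), and the continuity-corrected normal deviate (5)."""
    N = AB + Ab + aB + ab
    A, a, B, b = AB + Ab, aB + ab, AB + aB, Ab + ab
    S = AB * ab - Ab * aB
    variance = A * a * B * b / (N - 1)
    return S, variance, (abs(S) - N / 2.0) / math.sqrt(variance)


def hypergeom_tail(k_values, row_total, col_total, n):
    """Exact probability that the cell takes any of k_values, margins fixed."""
    total = math.comb(n, row_total)
    return sum(math.comb(col_total, k) * math.comb(n - col_total, row_total - k)
               for k in k_values) / total


S, variance, z = two_by_two_deviate(10, 2, 2, 3)
p_normal = 0.5 * math.erfc(z / math.sqrt(2.0))      # one tail
p_exact = hypergeom_tail([3, 4, 5], 5, 5, 17)       # Overactive-Excessive cell
```

Adding the value 0 to `k_values` gives the two-sided exact probability quoted in the text.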


* This is making the common assumption that $\{(AB)(ab) - (Ab)(aB)\}^2 N/\{(A)(a)(B)(b)\}$ is
distributed as $\chi^2$ with 1 degree of freedom, or that its square root is a normal deviate with sign depending
on the sign of $(AB)(ab) - (Ab)(aB)$.


REFERENCES 

Chambers, E. G. (1946). Statistical techniques in applied psychology. Biometrika, 33, 269.

Kendall, M. G. (1938). A new measure of rank correlation. Biometrika, 30, 81.

Kendall, M. G. (1946). The treatment of ties in ranking problems. Biometrika, 33, 239.

Pearson, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed
in a 2 × 2 table. Biometrika, 34, 139.

Sillitto, G. P. (1947). The distribution of Kendall's coefficient of rank correlation in rankings
containing ties. Biometrika, 34, 36.





THE VARIANCE OF τ WHEN BOTH RANKINGS CONTAIN TIES


By M. G. KENDALL 


1. The variance of $\tau$ in the population of sample permutations was given in my paper of
1938 for the case where no tied ranks exist. Mr Sillitto (1947) has given the formula where
one ranking contains ties but the other does not. In the foregoing paper Mr Whitfield has
correctly surmised the variance when one ranking contains ties, and the other is a dichotomy.
In this note I derive the general formula for the variance when both rankings contain ties.
The results of Messrs Sillitto and Whitfield then follow as special cases.

2. I shall follow the method of Daniels (1944). If $a_{ij}$ represents the contribution of the
$i$th and $j$th members of a ranking to $\tau$ we have

$$a_{ij} = +1 \ (i < j), \qquad a_{ij} = 0 \ (i = j), \qquad a_{ij} = -1 \ (i > j). \qquad (1)$$

We write

$$c_{ij} = a_{ij}b_{ij}, \qquad (2)$$

where $a$ and $b$ refer to different rankings, and

$$c = \sum_{i,j=1}^{n} c_{ij}. \qquad (3)$$

The quantity $c$ is simply related to $S$ by the relation

$$c = 2S, \qquad (4)$$

and for the testing of $\tau$ it is sufficient to test $c$ or $S$ which are merely constant multiples of $\tau$.
I work with the quantity $c$.


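3. We have, from Daniels's results,

$$\sum_{j=1}^{n} a_{ij} = n + 1 - 2i, \qquad (5)$$

$$\sum_{i,j=1}^{n} a_{ij}^2 = n(n-1), \qquad (6)$$

$$\sum_{i,j,l=1}^{n} a_{ij}a_{il} = \tfrac13 n(n^2 - 1), \qquad (7)$$

$$E(c) = 0, \qquad (8)$$

$$E(c^2) = \frac{4}{n(n-1)(n-2)}\Big\{\sum a_{ij}a_{il} - \sum a_{ij}^2\Big\}\Big\{\sum b_{ij}b_{il} - \sum b_{ij}^2\Big\} + \frac{2}{n(n-1)}\sum a_{ij}^2\sum b_{ij}^2. \qquad (9)$$

If we substitute from (6) and (7) in (9) we find

$$E(c^2) = \tfrac29 n(n-1)(2n+5), \qquad (10)$$

or, equivalently,

$$E(S^2) = \tfrac{1}{18}n(n-1)(2n+5), \qquad (11)$$

from which the variance of $\tau$ in the case of untied rankings follows at once.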

4. Now suppose that sets of $t_1$, $t_2$, ... consecutive members in one ranking are tied. In
place of (6) we then have

$$\sum_{i,j=1}^{n} a_{ij}^2 = n(n-1) - \Sigma t(t-1), \qquad (12)$$

the summation on the right taking place over the various values of $t$. This result follows
simply from the consideration that for a pair of tied ranks $a_{ij} = 0$, and consequently the sum
of squares of contributions from a tied set is of the same form as for the ranking as a whole.
In place of (7) we have

$$\sum_{i,j,l=1}^{n} a_{ij}a_{il} = \tfrac13 n(n^2 - 1) - \tfrac13\Sigma t(t^2 - 1). \qquad (13)$$

This is not quite so obvious. Consider a set of tied ranks. The contribution to the sum on
the left of (13) will be unchanged if the suffixes $j$, $l$ fall outside this set. If they both fall
inside, no contribution arises and therefore we have to subtract the term $\tfrac13 t(t^2 - 1)$. The
remaining possibility is that one falls inside and one outside. In such a case the contribution
remains unchanged in total for it is zero in the original untied case, each possible pair
occurring once to give +1 and once to give −1. Formula (13) follows.


5. By substitution in (9) we then have, for two rankings with ties typified respectively
by $t$ and $u$,

$$E(c^2) = \frac{4}{n(n-1)(n-2)}\{\tfrac13 n(n-1)(n-2) - \tfrac13\Sigma t(t-1)(t-2)\} \times \{\tfrac13 n(n-1)(n-2) - \tfrac13\Sigma u(u-1)(u-2)\}$$

$$\qquad + \frac{2}{n(n-1)}\{n(n-1) - \Sigma t(t-1)\}\{n(n-1) - \Sigma u(u-1)\}. \qquad (14)$$

This is the general formula required. We can express it in the alternative form

$$E(c^2) = \tfrac29 n(n-1)(2n+5) - \tfrac29\Sigma t(t-1)(2t+5) - \tfrac29\Sigma u(u-1)(2u+5)$$

$$\qquad + \frac{4}{9n(n-1)(n-2)}\{\Sigma t(t-1)(t-2)\}\{\Sigma u(u-1)(u-2)\}$$

$$\qquad + \frac{2}{n(n-1)}\{\Sigma t(t-1)\}\{\Sigma u(u-1)\}. \qquad (15)$$


6. (i) If one ranking is untied, say all the $u$'s are zero, we have Mr Sillitto's result

$$E(c^2) = \tfrac29\{n(n-1)(2n+5) - \Sigma t(t-1)(2t+5)\}. \qquad (16)$$

(ii) If one ranking is untied and the other is a dichotomy into $x$ and $n - x = y$ members,
(16) reduces to

$$E(c^2) = \tfrac43 xy(n+1), \qquad (17)$$

agreeing with Mr Whitfield's equation (2).

(iii) If one ranking contains ties and the other is a dichotomy we find on substitution
in (14)

$$E(c^2) = \frac{4xy}{3n(n-1)}\{(n^3 - n) - \Sigma(t^3 - t)\}, \qquad (18)$$

agreeing with Mr Whitfield's equation (3).

(iv) Finally, if both variates are dichotomized into $x$, $y$ and $p$, $q$ we find

$$E(c^2) = \frac{4xypq}{n-1}, \qquad (19)$$

agreeing with Mr Whitfield's equation (4).
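Formula (14) and its special cases lend themselves to a direct check by enumeration. The sketch below codes (14) in exact rational arithmetic and, for small $n$, compares it with the mean of $c^2 = (2S)^2$ over all permutations of one ranking. The brute-force routine is an illustration added here, not part of the derivation, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def e_c2(n, t_extents, u_extents):
    """E(c^2) from formula (14); the tie extents of the two rankings are t and u."""
    n2, n3 = n * (n - 1), n * (n - 1) * (n - 2)
    t2 = sum(t * (t - 1) for t in t_extents)
    t3 = sum(t * (t - 1) * (t - 2) for t in t_extents)
    u2 = sum(u * (u - 1) for u in u_extents)
    u3 = sum(u * (u - 1) * (u - 2) for u in u_extents)
    return (Fraction(4, 9 * n3) * (n3 - t3) * (n3 - u3)
            + Fraction(2, n2) * (n2 - t2) * (n2 - u2))


def sign(v):
    return (v > 0) - (v < 0)


def brute_e_c2(a, b):
    """Mean of (2S)^2 over every permutation of ranking b against ranking a."""
    n, total, count = len(a), 0, 0
    for perm in permutations(b):
        s = sum(sign(a[j] - a[i]) * sign(perm[j] - perm[i])
                for i in range(n) for j in range(i + 1, n))
        total += (2 * s) ** 2
        count += 1
    return Fraction(total, count)
```

Enumeration grows as $n!$ and is practicable only for very small rankings, which is precisely why a closed formula such as (14) is wanted.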


REFERENCES 

See the references to Mr Whitfield’s paper together with : 

Daniels, H. E. (1944). The relation between measures of correlation in the universe of sample
permutations. Biometrika, 33, 129.





A χ² 'SMOOTH' TEST FOR GOODNESS OF FIT

By F. N. DAVID


1. The $\chi^2$ test occupies a central position in statistical theory, and it is difficult to imagine
another test which would have the same generality of application. We shall be concerned
here with one aspect only, that is, the uses of $\chi^2$ in the tests for agreement between hypothesis
and observation which are usually loosely classed together under the name of tests for
goodness of fit. The principal advantages of $\chi^2$ for such tests would seem to be (i) that it is
applicable to grouped observations, (ii) that the parameters of the hypothesis tested may be
calculated from the observational data and the fact allowed for in the degrees of freedom with
which the criterion is assumed to be distributed and (iii) that it is easy to calculate, for the
number of computations involved is just the number of groups into which the observational
data are divided. It has, however, two defects which have long been known and which are
easily recognized from the form of the criterion itself. Broadly speaking we may define
$\chi^2$ as follows:

$$\chi^2 = \text{sum for all groups of } \frac{(\text{Observed value} - \text{Expected value})^2}{\text{Expected value}}.$$

It will be seen (iv) that in taking the square of (Observed value − Expected value) the
knowledge regarding the sign of the deviation is lost. Further (v) there is no means of
preserving the order of the signs of the deviations, and no distinction can therefore be made
between a departure from the hypothesis tested in which the deviations were first all positive
and then all negative and a departure in which the sizes and signs of the deviations were
random between themselves.

2. The ideal test for goodness of fit should certainly take into account (i), (ii), (iv) and (v)
and probably (iii) also, but it is unlikely that this ideal will be reached. It would seem at
the present time that the most which may be hoped for are tests which will supplement the
$\chi^2$ test in that they will be more sensitive to given alternatives of the hypothesis under test.
Neyman (1937) put forward such a supplementary test in which he developed the $\psi^2$
criterion. This criterion was designed to be sensitive to alternate hypotheses of a type he
designated as smooth; that is to say, if the hypothesis under test is a continuous curve, such
as, for example, the normal curve, admissible hypotheses alternate to it might be other
normal curves with a different mean and a different standard deviation. Neyman's
criterion certainly took into account (iv) and (v), but it did not fulfil (i), in that it was only
applicable to ungrouped observations,* or (ii), in that the parameters of the hypothesis
tested had to be known a priori. Whether it fulfilled (iii) is a matter for personal
opinion.†

3. It would appear possible that tests which would take into account the sign only of the
deviations of observation from hypothesis, and the order of these signs, may be devised
from simple combinatorial principles. Suppose that there are $N$ observations which are
divided into $k$ groups. Let $n_i$ ($i = 1, 2, \dots, k$) be the actual number of observations falling
into the $i$th group, and let $m_i$ ($i = 1, 2, \dots, k$) be the expected number. It is possible
theoretically for $\chi^2$ to be calculated for the case where

$$\sum_{i=1}^{k} n_i \ne \sum_{i=1}^{k} m_i,$$

but such cases must be rare in statistical practice. We shall overlook this case and will
consider the case where the totals of observed and expected are made equal to one another
with the resultant loss of one degree of freedom in the calculation of $\chi^2$. If the totals agree then

$$\sum_{i=1}^{k} n_i - \sum_{i=1}^{k} m_i = \sum_{i=1}^{k} \delta_i = 0,$$

where $\delta_i = n_i - m_i$. In order that the sum of these $\delta$'s should be zero, at least one of them
must be negative in sign, but which one of these $\delta$'s it will be would seem to be a matter of
chance. It is on this fact that we shall base the first test criterion.

* Prof. Neyman tells me that his criterion has been adapted for the case of grouped observations
but that he has not yet published this extension.

4. Suppose that we have a sequence of signs of which $r_1$ are positive and $r_2$ negative,
where $r_1 + r_2 = r$, and $r_1 > 0$ and $r_2 > 0$. These signs are postulated to occur in a random order.
Given such a sequence it is easy to record the number of sets of positive and negative signs.
For example, if the sequence is

+ + + + + - + + + + -, 

then $r = 15$, $r_1 = 9$, $r_2 = 6$, and there are four sets of positive signs and four sets of negative
signs. In general there can be (α) $t$ positive, $t$ negative, or (β) $t$ positive, $t + 1$ negative, or
(γ) $t + 1$ positive and $t$ negative sets of signs. If $T = 2t$ or $2t + 1$ as required, we may ask what
is the probability that given $r_1$ and $r_2$ such a number $T$ of sets (alternately positive and
negative) would have arisen through chance. This probability follows at once from
Whitworth, Choice and Chance, Proposition xxv, viz.: 'The number of ways in which $n$ indifferent
things can be distributed in $t$ different parcels (blank lots being inadmissible) is

$$(n-1)!/\{(t-1)!\,(n-t)!\}$$'.*

5. The total number of ways in which $r_1$ and $r_2$ elements can be arranged is

$$\frac{r!}{r_1!\,r_2!}.$$

We now require to enumerate the number of ways in which $r_1$ can be arranged to form $t$ sets
and $r_2$ to form $t$ sets. To arrange $r_1$ in $t$ sets is equivalent (vide Whitworth) to making $t - 1$
breaks in a sequence of $r_1$ observations, and this may be done in

$$(r_1 - 1)!/\{(t-1)!\,(r_1 - t)!\} \text{ ways},$$

and similarly for $r_2$. It is not specified whether + or − should start the sequence, and hence
the total number of ways in which a sequence $r_1 + r_2$ may be arranged in $t$ sets each is

$$\frac{2\,(r_1 - 1)!\,(r_2 - 1)!}{(t-1)!\,(t-1)!\,(r_1 - t)!\,(r_2 - t)!}.$$

* Since I first thought of this method of attack I have found that the distribution of groups as
given by me in §5 has already been given by W. L. Stevens, Ann. Eugen., Lond. 9, 10, and by A. Wald
and J. Wolfowitz, Ann. Math. Statist. 11, 147. The probability function has been tabled by F. S. Swed
and C. Eisenhart, Ann. Math. Statist. 14, 66, but it is not in a form that I found suitable for my
purposes. The probability function has been known for many years; what is interesting is the different
uses to which it has been put.



The probability of $2t$ sets will be

$$P\{2t \mid r_1, r_2\} = \frac{2\,r_1!\,(r_1 - 1)!\,r_2!\,(r_2 - 1)!}{r!\,(t-1)!\,(t-1)!\,(r_1 - t)!\,(r_2 - t)!}, \qquad (1)$$

and the probability of obtaining $(2t + 1)$ sets will be

$$P\{2t + 1 \mid r_1, r_2\} = P\{t \mid r_1,\ t + 1 \mid r_2\} + P\{t + 1 \mid r_1,\ t \mid r_2\} = P\{2t \mid r_1, r_2\}\,\frac{r - 2t}{2t}. \qquad (2)$$


Hence given r v r 2 and T from a random sequence of positive and negative signs the 
probability of such a number of sets having arisen through chance may be calculated. 
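The probabilities (1) and (2) are conveniently computed from binomial coefficients, since $(r_1-1)!/\{(t-1)!(r_1-t)!\}$ is the number of ways of choosing $t - 1$ breaks among $r_1 - 1$ gaps. The sketch below is one way of coding them, with a cumulative sum for the lower tail; the function names are illustrative and do not reproduce the author's Table 1.

```python
from math import comb


def prob_sets(T, r1, r2):
    """P{T sets | r1 positive, r2 negative signs} for a random order of signs."""
    t, odd = divmod(T, 2)
    if t < 1:
        return 0.0          # with r1 > 0 and r2 > 0 there are at least two sets
    if odd:
        # t positive and t + 1 negative sets, or t + 1 positive and t negative
        ways = (comb(r1 - 1, t - 1) * comb(r2 - 1, t)
                + comb(r1 - 1, t) * comb(r2 - 1, t - 1))
    else:
        # t sets of each sign; either sign may start the sequence
        ways = 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
    return ways / comb(r1 + r2, r1)


def prob_at_most(T0, r1, r2):
    """P{T <= T0}, the chance of as few sets as observed, or fewer."""
    return sum(prob_sets(T, r1, r2) for T in range(2, T0 + 1))
```

For the sequence of §4 one would evaluate `prob_at_most(8, 9, 6)`; a small value points to fewer sets than chance would readily produce.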

0. It is desired to use the probability of a given arrangement of signs in order to test a 
given hypothesis represented by a smooth probability law, bearing in mind that, if the given 
hypothesis is not true, then any alternative law is likely to be of a smooth type. Although 
no exact definition of a smooth alternative distribution has been made, it may be stated 
here that smooth, in the sense used by Neyman, will imply that the number of sets of signs 
will be small. For example, if the hypothesis tested is that observations follow a given normal 
curve, whereas in fact they have been drawn from a normal distribution identical with the 
first but with a smaller mean, then the differences between observation and expectation on 
the basis of the hypothesis tested may be expected to give a preponderance of positive 
signs below the sample mean and of negative signs above it; that is to say, if the difference 
in means is sufficient to offset the sampling fluctuations we should find a single set of positive
signs followed by a single set of negative signs. If the true population is a normal curve with 
the same mean but with a larger standard deviation than that specified by the hypothesis 
tested, then there will be a tendency towards a set of positive signs, a set of negative signs, 
followed by a set of positive signs, although sampling fluctuations may not leave such a 
clear-cut answer. The more complex the alternative hypothesis the less chance there will be 
of detecting it. 

7. With this objective in view it is proposed to take $T$, the number of sets of signs, as
the test criterion, rejecting the hypothesis tested whenever, for a given $r_1$ and $r_2$, $T$ is
exceptionally small. This we do on the grounds that the existence of very few sets of signs suggests
that the differences between observed and expected frequencies are not due to chance sampling
fluctuations but to some systematic departure of the true probability law (assumed
smooth) from hypothesis. In following this procedure we should reject the hypothesis if

$$P\{T \le T_0\} = \sum_{T=2}^{T_0} P\{T \mid r_1, r_2\} \le \epsilon,$$

where $T_0$ is the observed value of $T$ and $\epsilon$ the significance level selected as appropriate.
Exact probabilities are given in Table 1, and the application of the test is immediate.*
There seems to be no reason why the test should not be applicable to both grouped and 
ungrouped observations, although the formulation of the hypothesis tested may be some- 
what different in the two cases. Consider a sample which has been supposedly randomly 


* An assumption implicit in the test would appear to be that for each $\chi^2$ cell there is an equal chance
of obtaining a positive or a negative deviation, that is, that there are sufficient numbers in each cell
for the binomial to be closely approximated to by a normal curve. An extensive series of random sampling
experiments has shown, however, that the divergence between theory and practice is not significant
even when the probability of obtaining a positive is four times that of obtaining a negative.
Hence while strictly the expectation in each cell of $\chi^2$ should be 10 or over, it would seem that for
practical purposes the $T$ test may be applied in all cases where the application of the $\chi^2$ test is
permissible.
















drawn from some population. Let the elements of the sample in order of drawing be
$x_1, x_2, \ldots, x_n$. We may use the $T$ criterion to test the hypothesis of randomness, in the
following way. If $u_1$ is the smallest value observed in the sample and $u_n$ the largest, then if we
exclude the trivial case when all the $x$'s are equal it is easy to show that $u_1 < \bar{x} < u_n$, where

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

If we now consider the deviations

$$x_i - \bar{x} = \delta x_i \quad \text{for } i = 1, 2, \ldots, n,$$

there will be a series $\delta x_1, \delta x_2, \ldots, \delta x_n$, some of which quantities will be positive and some
negative. The application of the $T$ test is immediate, the admissible alternative hypotheses
being that if the drawing of the sample is not at random then bias of the smooth kind is
present.

8. As an illustration consider the following two cases: 


Case I.
Expected frequency    10   25   35   75  155  155   75   35   25   10
Observation           12   29   45   81  160  145   69   31   20    8
Deviation              +    +    +    +    +    -    -    -    -    -

Case II.
Expected frequency    10   25   35   75  155  155   75   35   25   10
Observation           12   23   45   66  161  160   69   36   20    8
Deviation              +    -    +    -    +    +    -    +    -    -


In the first case $\chi^2 = 6.94$ and in the second $\chi^2 = 6.80$; in neither case would the hypothesis
be rejected as inadequate by using the $\chi^2$ criterion. The $T$ criterion does, however, bring
out the essential difference:

Case I. $r_1 = 5$, $r_2 = 5$, $T_0 = 2$ and $P\{T \le T_0\} = 1/126$.

Case II. $r_1 = 5$, $r_2 = 5$, $T_0 = 8$ and $P\{T \le T_0\} = 121/126$.

Using the $T$ test we should be inclined to reject the first hypothesis in favour of a smooth
alternative, while for the second case we should be inclined to agree with the conclusion
drawn from the $\chi^2$ test that the observational material is adequately described.
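The two cases can be worked through mechanically: take the sign of each deviation, count the sets, and sum relations (1) and (2) up to the observed number of sets. The sketch below is an illustration only; `prob_sets` and `sign_test` are names invented for this example, and the treatment of a zero deviation (counted as negative) is an assumption, since no such case arises in the data above.

```python
from math import comb


def prob_sets(T, r1, r2):
    """P{T | r1, r2} from relations (1) and (2)."""
    if T < 2 or r1 < 1 or r2 < 1:
        return 0.0
    t, odd = divmod(T, 2)
    if odd:
        ways = (comb(r1 - 1, t) * comb(r2 - 1, t - 1)
                + comb(r1 - 1, t - 1) * comb(r2 - 1, t))
    else:
        ways = 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
    return ways / comb(r1 + r2, r1)


def sign_test(observed, expected):
    """Return (r1, r2, T0, P{T <= T0}) for the signs of observed - expected.

    Assumption of this sketch: a zero deviation is counted as negative.
    """
    signs = [o > e for o, e in zip(observed, expected)]
    r1 = sum(signs)                       # number of positive signs
    r2 = len(signs) - r1                  # number of negative signs
    t0 = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    p = sum(prob_sets(T, r1, r2) for T in range(2, t0 + 1))
    return r1, r2, t0, p


expected = [10, 25, 35, 75, 155, 155, 75, 35, 25, 10]
case_1 = [12, 29, 45, 81, 160, 145, 69, 31, 20, 8]
case_2 = [12, 23, 45, 66, 161, 160, 69, 36, 20, 8]

if __name__ == "__main__":
    for name, obs in (("Case I", case_1), ("Case II", case_2)):
        print(name, sign_test(obs, expected))
```

For Case I the cumulative probability comes out near 0.008 and for Case II near 0.960, in line with the conclusions drawn in the text.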

9. Sampling material is available whereby the theoretical distribution of T may be tested 
in practice. Neyman & Pearson (1928) took 208 samples, each of size 200, from a population 
of eight groups described by the cubic curve 

y = 25 + *g-x - y£gx 3 . 

The expectation in each cell for a sample of this size was calculated and the $\chi^2$ criterion found
for each of the 208 samples. The writer was given access to these calculations and was able
to find the sampling distribution of $T$ from the material. The results of this sampling experiment
and the theoretical distribution of $T$ from relations (1) and (2) are given in Table 2.

The agreement between theory and practice would seem to be reasonably good, and in
the cases (4, 4) and (5, 3) the values of $\chi^2$, calculated to test the discrepancy between theory
and practice, were not greater than might be attributable to sampling fluctuations. It was
not thought worth while to calculate $\chi^2$ for (6, 2) and (7, 1). A second sampling experiment
in which samples of size 360 were drawn from a normal population of fifteen groups lent
further support to the reasonableness of the theoretical distribution.

10. The $T$ criterion will be a useful supplementary criterion to the $\chi^2$, but because it
takes account solely of the sign of a deviation and not of its magnitude it will probably
only be useful when used in conjunction with $\chi^2$. A test of significance which could combine
both the probability levels of $T$ and $\chi^2$ would undoubtedly be more useful, and we may



304 A x* ‘ smooth ’ test for goodness of fit 

therefore consider how this might be done. Unless the exact degree of dependence which
exists between two variables is known it is usually only possible to obtain their joint
distribution if they are independent. It would appear reasonable, both on theoretical grounds
and from sampling experiments, to assume that $T$ and $\chi^2$ are independent, or, if the assumptions
underlying both tests are not exactly fulfilled, to assume that the degree of dependence
between them is at most small.


Table 2. Comparison of theoretical distribution of T with
that derived from a sampling experiment

(4 positive, 4 negative)

T = number of sets      2      3      4      5      6      7      8   Total
Sampling                3      5     20     25     28      5      7      93
Theory                2.7    8.0   23.9   23.9   23.9    8.0    2.7      93

(5 positive, 3 negative) or (3 positive, 5 negative)

T = number of sets      2      3      4      5      6      7   Total
Sampling                2      8     32     30     20     10     102
Theory                3.6   10.9   29.1   29.1   21.9    7.3     102

(6 positive, 2 negative) or (2 positive, 6 negative)

T = number of sets      2      3      4      5   Total
Sampling                1      3      —      5       9
Theory                0.6    1.9    3.2    3.2       9

(7 positive, 1 negative) or (1 positive, 7 negative)

T = number of sets      2      3   Total
Sampling                —      2       2
Theory                0.5    1.5       2


11. We shall begin by demonstrating that as far as mathematics are concerned the $T$
and $\chi^2$ criteria are completely independent.* For simplicity of argument let us consider the
case of three groups only. The sample may then be represented by a point $(n_1, n_2, n_3)$ in
three-dimensioned space, with axes of reference $On_1$, $On_2$, $On_3$, and the expected population
values by a point $(m_1, m_2, m_3)$ in the same space. Since

$$n_1 + n_2 + n_3 = m_1 + m_2 + m_3 = N,$$

* This method of approach was suggested to me by Andrew Gleason of Harvard University at a 
seminar given at the Statistical Laboratory, University of California, at Berkeley. 








these points are constrained to lie in a plane. Fig. 1 shows this plane for the particular case
$N = 16$; $m_1 = 4$, $m_2 = 8$, $m_3 = 4$. Since no frequency can be negative, possible sample points
must be within an equilateral triangle lying in this plane, the chance of occurrence associated
with a point being the multinomial term

$$\frac{N!}{n_1!\,n_2!\,n_3!}\left(\frac{m_1}{N}\right)^{n_1}\left(\frac{m_2}{N}\right)^{n_2}\left(\frac{m_3}{N}\right)^{n_3}.$$

When using the $\chi^2$ test the mathematical approximation consists in substituting for this
term an expression proportional to $e^{-\frac{1}{2}\chi^2}$, in regarding this last as a continuous function, and

Fig. 1. Graphical illustration of the $\chi^2$ contours and the change in signs of the $\delta n$'s.
$n_1$, $n_2$ and $n_3$ denote the points of intersection of the $On_1$, $On_2$, $On_3$ axes with
the plane $n_1 + n_2 + n_3 = N$. According to the approximation, the chance equals
$\alpha$ of obtaining a sample point lying outside the elliptic contour on which $\chi^2 = \chi^2_\alpha$.

in taking as a measure of goodness of fit the integral of this expression outside the ellipse
which passes through the sample point and on which $\chi^2$ is constant. For the case of three
groups this integral itself assumes the simple form $e^{-\frac{1}{2}\chi^2}$. Three such elliptic contours are
shown in the diagram.

Planes through $(m_1, m_2, m_3)$ parallel to the co-ordinate planes $n_1On_2$, $n_2On_3$, $n_3On_1$, will
intersect the sample plane

$$n_1 + n_2 + n_3 = N$$








in three straight lines. As shown in the diagram, these lines divide the sample plane into six
sectors, and for all sample points within a sector the signs of the differences $\delta n_i = n_i - m_i$
will remain unchanged. Any test based solely on runs of signs will consist in taking one or
more of these sectors as critical regions and rejecting the hypothesis tested when the sample
point falls therein. It is clear that if we use the mathematical approximation, the distribution
of $\chi^2$ is the same within each sector; similarly, that the chance of a sample showing a given
combination of signs is the same on each ellipse along which $\chi^2$ is constant. Thus under the
assumptions made regarding the distribution of $\chi^2$, the $T$ and $\chi^2$ criteria are completely
independent.

In this case of three groups $T$ can only assume values of two or three and the former value
would not be judged significant, but the argument will follow exactly similar lines in the
case of many groups. The number of sectors will be in general $2(2r_2 + 1)$ if $r_1 > r_2$ and $2(2r_2)$
if $r_1 = r_2$, and they will be bounded by primes passing through the population point.

12. While the distributions of $T$ and $\chi^2$ are independent for this mathematical model
they are unlikely to be exactly so when we go back to the true multinomial density
distribution, because the sample space is neither continuous nor infinite. The model, in fact,
becomes inaccurate if $m_1$, $m_2$ or $m_3$ are very small. For example, it is seen in Fig. 1 that while
the 1 % ellipse ($\chi^2 = \chi^2_{0.01}$) lies completely within the triangular space for the sectors with

signs -I 1- and h , it lies completely without the space for the sector + H — and 

partly without for the other sectors. It has been thought worth while therefore to test
whether the two criteria are independent in practice, and to this end the same material
previously described has been utilized. Tables 3 and 4 give the distribution of mean $\chi^2$
for different values of $P\{T\}$ and the distribution of mean $P\{T\}$ for grouped values of $\chi^2$.
There is little evidence in these figures to show that $P\{T\}$ and $\chi^2$ (and therefore $P\{\chi^2\}$) are
related. The figures therefore lend support to the geometrical argument and indicate that
the approximations involved in $\chi^2$, both from the small sample and the fact that the sample
space is not infinite, do not invalidate the mathematical result.

13. In order to combine the $\chi^2$ and $T$ tests of significance it will be necessary to develop
a theory for the combination of two tests of significance when one criterion is a continuous
and the other a discontinuous variable. R. A. Fisher has set out the test for the combination
of tests of significance from a number of independent continuous variables. The keystone
of the test is the recognition of the fact that if $Z$ is a continuous variable, then $z$, where

$$z = \int p(Z)\,dZ,$$

is also a continuous variable equally likely to have any value between 0 and 1; we shall
describe $z$ as being distributed rectangularly. Minus twice the natural logarithm of the product of
two such $z$'s, say $z_1$ and $z_2$, where $z_1$ and $z_2$ follow from two independent tests of significance, can be
shown to be distributed as $\chi^2$ with four degrees of freedom. Consider a discontinuous variable
$X$ which may take values $X_1, X_2, \ldots, X_s$ and which has an elementary probability law

$$P\{X = X_j\} = p_j, \quad \text{where } 0 < p_j < 1 \text{ for } j = 1, 2, \ldots, s \quad \text{and} \quad \sum_{j=1}^{s} p_j = 1.$$

If a new variable, $x$, is defined as taking values $x_1, x_2, \ldots, x_s$, where

$$x_k = \sum_{j=1}^{k} p_j,$$





Table 3. Mean $\chi^2$ for different values of $P\{T\}$


r v r, or r v r t 


4,4 

6,3 

4,4 

5, 3 

6, 2 

4,4 

6,3 

4,4 

6,2 

No. of obs. on which 
mean is based 

26 

6 

20 

28 

30 


26 

32 

20 

3 

P{T) 

1-00 

0*97 

0-93 

0-80 

0-71 

0-64 

0-03 

043 

0*37 

0*29 

Mean y* 

7-19 

0-04 

6*87 

6*43 

6-98 


7*41 

7-32 

0-82 

4*86 


I r l9 r, or r § , r x 

No. of obe. on which 
mean is baaed 

7,1 

5,3 

8 

4,4 

5 

«,2 

1 

6, 3 and 4, 4 

5 

P{ T) 

Mean x 8 

i 

0-25 

0*14 

010 

0*11 

6-53 

0-07 

0-95 

0*04 and 0*03 
8-0G 


Table 4. Mean $P\{T\}$ for grouped $\chi^2$


* ! 

1 

0*0-10 

1 *0-2*0 

2-0-3-0 

30-4*0 

4*0-50 

5*0-00 

0*0-70 

70-8*0 

80-9*0 

90-100 

No. of obs. on which 

1 

7 

14 

27 

32 

29 

21 

14 

u 

15 

mean is based 











Mean P{T) 

0*71 

0*73 

0*09 

0*00 

0*68 

0*00 

0*05 

0*57 

0-67 

0*64 


X 1 

10 0-1 10 

110-120 

120-130 

13*0-140 

140-150 

150-100 

160-170 

17*0-18*0 

No. of obs. on which 

10 

10 

5 

4 

2 

l 

2 

1 

mean is based 









Mean P{T] 

0*74 

0-05 

0*74 

0*58 

0*82 

1*00 

0*03 

0*03 


then $x_k$ may only take values between 0 and 1 for $k = 1, 2, \ldots, s$. It is required to find the
joint probability law of the product of two independent variables $x$ and $z$, where $x$ and $z$
are as defined above. It will be noted that the elementary probability law of $x$ will be

$$P\{x = x_j\} = p_j \quad (j = 1, 2, \ldots, s).$$

Hence when $x = x_j$ (the probability of which is $p_j$), the product $xz$ will be distributed
rectangularly between 0 and $x_j$ on a proportion $p_j$ of occasions. It follows that $xz$ has a probability
distribution which has points of discontinuity at $x_1, x_2, \ldots, x_s$, that it is distributed
rectangularly between these points of discontinuity, and that

$$P\{0 < xz < x_1\} = p_1 \sum_{j=1}^{s} \frac{p_j}{x_j}, \qquad P\{x_1 < xz < x_2\} = p_2 \sum_{j=2}^{s} \frac{p_j}{x_j}.$$

Generally

$$P\{x_{k-1} < xz < x_k\} = p_k \sum_{j=k}^{s} \frac{p_j}{x_j}.$$
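The interval probabilities above can be checked numerically by writing down the distribution function of the product directly. In the sketch below, which is illustrative only, `product_cdf` and `interval_prob` are names chosen for this example; it is assumed that each $x_k$ is the cumulative sum $p_1 + \ldots + p_k$ and that every $p_j$ is strictly positive, as the text requires.

```python
from itertools import accumulate


def product_cdf(y, p):
    """P{xz <= y}: z rectangular on (0, 1), x = x_k with probability p_k.

    Given x = x_k the product is rectangular on (0, x_k), so it falls
    below y with chance min(1, y / x_k).
    """
    xs = accumulate(p)                      # x_k = p_1 + ... + p_k
    return sum(pk * min(1.0, y / xk) for pk, xk in zip(p, xs))


def interval_prob(k, p):
    """P{x_(k-1) < xz < x_k} = p_k * sum over j >= k of p_j / x_j (k is 1-based)."""
    xs = list(accumulate(p))
    return p[k - 1] * sum(pj / xj for pj, xj in zip(p[k - 1:], xs[k - 1:]))


if __name__ == "__main__":
    p = [0.1, 0.2, 0.3, 0.4]                # an arbitrary illustrative law
    for k in range(1, len(p) + 1):
        print(k, round(interval_prob(k, p), 4))
```

Differences of `product_cdf` at successive points $x_{k-1}$, $x_k$ agree with `interval_prob(k, p)`, which is the content of the general relation.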





14. If we now apply this theory to the combination of the tests of significance of $T$ and
$\chi^2$, it is seen that we must consider the product of $P\{\chi^2\}$ and $P\{T\}$. $\chi^2$ is a continuous
variable and

$$z = \int_{\chi^2_0}^{+\infty} p(\chi^2)\,d(\chi^2) = P\{\chi^2 > \chi^2_0\} = P\{\chi^2\}$$

is distributed rectangularly between 0 and 1, and

$$x = \sum_{T=2}^{T_0} P\{T \mid r_1, r_2\} = P\{T \le T_0\} = P\{T\}$$

is a discontinuous variable taking known values. The probability integral of $xz$ is thus known
from theory and $\Gamma_{0.05}$ or $\Gamma_{0.01}$ can be found to satisfy the relation

$$P\{0 < xz \le \Gamma_\epsilon\} = \epsilon.$$

These probability levels are given in Table 5. The procedure for the joint test of significance
will be:

(i) calculate $P\{T\}$ as described in § 7;

(ii) calculate $P\{\chi^2\}$ in the usual way. The degrees of freedom will be the number of groups
minus one;

(iii) multiply $P\{T\}$ and $P\{\chi^2\}$ together and refer to Table 5 to judge the significance of
the product.
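The probability levels of the product can also be found numerically rather than read from a table. The sketch below is a hedged illustration, not the method by which the table was computed: the function names are invented here, the distribution of $T$ is taken from relations (1) and (2), and the level is located by simple bisection on the distribution function of $xz$.

```python
from math import comb


def set_distribution(r1, r2):
    """Probabilities of T = 2, 3, ..., T_max sets, from relations (1) and (2)."""
    total = comb(r1 + r2, r1)
    t_max = 2 * min(r1, r2) + (r1 != r2)
    probs = []
    for T in range(2, t_max + 1):
        t, odd = divmod(T, 2)
        if odd:
            ways = (comb(r1 - 1, t) * comb(r2 - 1, t - 1)
                    + comb(r1 - 1, t - 1) * comb(r2 - 1, t))
        else:
            ways = 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
        probs.append(ways / total)
    return probs


def joint_cdf(y, r1, r2):
    """P{xz <= y}, with x = P{T} discrete and z = P{chi2} rectangular on (0, 1)."""
    cum, total = 0.0, 0.0
    for p in set_distribution(r1, r2):
        cum += p                            # x_k, cumulative probability up to this T
        total += p * min(1.0, y / cum)
    return total


def critical_level(eps, r1, r2, tol=1e-10):
    """Bisection for the level at which P{xz <= level} = eps."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if joint_cdf(mid, r1, r2) < eps:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for r1, r2 in ((4, 1), (5, 1), (4, 2), (3, 3)):
        print(r1, r2, round(critical_level(0.05, r1, r2), 4),
              round(critical_level(0.01, r1, r2), 4))
```

For 4 positive and 1 negative sign, for instance, the 5 % level works out at 0.03125, which is the kind of value a table of such levels would record.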


Table 5. Values of $\Gamma_{0.05}$ and $\Gamma_{0.01}$, where $P\{P(\chi^2)\,P(T) \le \Gamma_\epsilon\} = \epsilon$

This table may be used to judge the significance of the joint distribution
of the T criterion and any other continuous criterion.


r 

ri 

r 2 

J'o-o. 

r».„, 

r 


r 2 


^ 0-01 

5 

4 

1 

00312 * 

00062 * 

11 

10 

1 

0 - 0276 + 

0 - 0055 + 


3 

2 

00213 

0-0043 


9 

2 

0-0171 

0-0034 







8 

3 

0-0144 

0-0028 

0 

5 

1 

00300 

0-0000 


7 

4 

0-0144 

0 - 0026 + 


4 

2 

00211 

0-0042 


0 

5 

0-0140 

0-0024 


3 

3 

00195 

0-0039 











12 

11 

1 

0-0273 

0 - 0055 - 

7 

6 

1 

0-0292 

0-0058 


10 

2 

0-0174 

0 - 0035 - 


5 

2 

0-0197 

0 - 0039 * 


9 

3 

0-0149 

0-0027 


4 

3 

00174 

0-0035 


8 

4 

0-0142 

0-0024 







7 

5 

0-0135 

0*0022 

8 

7 

1 

0-0280 

0-0057 


0 

0 

0-0131 

0-0021 


0 

2 

0-0188 

0-0038 







5 

3 

0-0100 

0-0032 

13 

12 

1 

0-0271 

0-0054 


4 

4 

0-0153 

0-0031 


11 

2 

0 - 0105 + 

0-0033 







10 

3 

0-0151 

0-0020 

9 

8 

1 

0-0281 

0-0050 


9 

4 

0-0138 

0-0023 


7 

2 

0-0180 

0-0030 


8 

5 

0-0137 

0*0022 


6 

3 

0-0153 

0-0031 


7 

0 

0-0138 

0 - 0022 6 


5 

4 

00140 

0-0028 











14 

13 

1 

0-0269 1 

0-0054 

10 

9 

1 

0-0278 

0-0050 


12 

2 

0-0163 

0-0033 


8 

2 

00175 

0-0035 


11 

3 

0-0151 

0 - 0025 + 


7 

3 

0-0143 

0-0029 


10 

4 

0 - 0135 ~ 

0 - 0022 s 


6 

4 

0-0143 

0-0020 


9 

5 

0-0138 

00023 


5 

5 

0-0143 

0-0025 


8 

0 

0-0130 

0-0022 

1 




! 


7 

7 

00134 

0*0022 




15. The application of the joint test of significance may be illustrated by means of an
example. A sample of 360 observations is available. This sample has actually been randomly
drawn from a normal population of which the mean is zero and the standard deviation unity.
The figures are given in Table 6. Calculations give $\chi^2 = 21.1$ and $P\{\chi^2\} = 0.10$. Judging
by the $\chi^2$ alone we should say probably that there is nothing out of the ordinary in the
deviations of the sample from the expected values. The number of signs is 15, of which 9
are positive and 6 negative, and these are arranged in six sets. Making the appropriate
calculations, we have

$$P\{T \le 6 \text{ sets} \mid 9 \text{ positive};\ 6 \text{ negative}\} = 0.175.$$

The arrangement of signs will therefore be judged as acceptable. The joint significance of a
$P\{\chi^2\} = 0.10$ and a $P\{T\} = 0.175$ is found, by evaluating the joint distribution, to be 0.066.
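The whole calculation for this example can be retraced from the figures of Table 6. The sketch below is an illustration under stated assumptions: the expectations are used as printed (rounded to one decimal), the names are invented for this example, and the tail probability of $\chi^2$ uses the closed form that holds when the number of degrees of freedom is even (here 14).

```python
from math import comb, exp, factorial

# Table 6: observed and (rounded) expected frequencies for the 15 groups.
observed = [12, 10, 18, 26, 23, 42, 43, 49, 35, 28, 20, 26, 20, 3, 5]
expected = [9.3, 8.6, 14.0, 21.0, 28.7, 35.9, 41.0, 43.0,
            41.0, 35.9, 28.7, 21.0, 14.0, 8.6, 9.3]


def chi2_tail_even(x, dof):
    """P{chi2 > x} for an even number of degrees of freedom."""
    if dof % 2:
        raise ValueError("closed form used here needs even degrees of freedom")
    h = x / 2
    return exp(-h) * sum(h ** i / factorial(i) for i in range(dof // 2))


def set_distribution(r1, r2):
    """Probabilities of T = 2, 3, ..., T_max sets, from relations (1) and (2)."""
    total = comb(r1 + r2, r1)
    probs = []
    for T in range(2, 2 * min(r1, r2) + (r1 != r2) + 1):
        t, odd = divmod(T, 2)
        if odd:
            ways = (comb(r1 - 1, t) * comb(r2 - 1, t - 1)
                    + comb(r1 - 1, t - 1) * comb(r2 - 1, t))
        else:
            ways = 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
        probs.append(ways / total)
    return probs


chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_chi2 = chi2_tail_even(chi2, len(observed) - 1)

signs = [o > e for o, e in zip(observed, expected)]
r1, r2 = sum(signs), len(signs) - sum(signs)
t0 = 1 + sum(a != b for a, b in zip(signs, signs[1:]))

probs = set_distribution(r1, r2)
cum, running = [], 0.0
for p in probs:
    running += p
    cum.append(running)                     # cumulative P{T <= 2}, P{T <= 3}, ...
p_T = cum[t0 - 2]

# Chance that a product at least as small as the observed one arises by chance.
y = p_chi2 * p_T
joint = sum(p * min(1.0, y / x) for p, x in zip(probs, cum))

if __name__ == "__main__":
    print(round(chi2, 2), round(p_chi2, 3), (r1, r2, t0), round(p_T, 3), round(joint, 3))
```

With the unrounded tail probability the joint level comes out slightly below the value obtained from the rounded $P\{\chi^2\} = 0.10$, but the two agree to about two decimal places.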


Table 6. Sample values. Observed and expected

Central values   -2.1 and under   -1.8    -1.5    -1.2    -0.9    -0.6    -0.3     0.0
Observation            12          10      18      26      23      42      43      49
Expectation             9.3         8.6    14.0+   21.0+   28.7    35.9    41.0+   43.0-
Deviation              +2.7        +1.4    +4.0    +5.0    -5.7    +6.1    +2.0    +6.0

Central values   +0.3    +0.6    +0.9    +1.2    +1.5    +1.8    +2.1 and over   Total
Observation       35      28      20      26      20       3           5          360
Expectation       41.0+   35.9    28.7    21.0    14.0     8.6         9.3        360
Deviation         -6.0    -7.9    -8.7    +5.0    +6.0    -5.6        -4.3          0


16. A study of the basic table (Table 1) of the function $T$ will show that $P\{T\}$ is not a
very sensitive criterion with which to judge the randomness of a sequence of signs unless
the number of groups under consideration is very large. For example, if there are 10 signs,
5 of which are positive and 5 of which are negative, the probability of getting two sets of
signs is 0.008. Thus the test would show, and rightly, that the chance of such an arrangement
is small, but this fact would undoubtedly be recognized by a skilled computer without the
use of a test at all. In the case of 10 signs the probability of three groups or less is 0.040,
and this would possibly be judged non-significant. Again, let us consider an extreme case, say
10 signs, 9 of which are positive and 1 negative. The $T$ criterion does not concern itself with
the fact that the numbers 9 and 1 are exceptional, it is merely concerned with deciding whether
their arrangement is exceptional given the 9 and 1. Table 1 shows that neither possible
arrangement would be considered out of the ordinary. It is these points of weakness which
show that the criterion $T$ is not of great utility except in combination with $\chi^2$. For, if we
consider the 9 positive, 1 negative case, common sense tells us that the $\chi^2$ criterion in such
a case would possibly be significant. Nine positive deviations have to be balanced by a
single negative deviation, and this last is therefore likely to be big. This does not influence $T$;
neither will the contribution of $T$ to the joint criterion be of much weight. This is as it should
be, for it is difficult to see how one can postulate a smooth alternative for 9 positive, 1 negative,
two sets, and not also for 9 positive, 1 negative, three sets. Generally, however, we
shall not meet such extreme cases in practice. One way of overcoming this weakness of the
test would be to consider the probability of obtaining $r_1$ positive and $r_2$ negative signs together
with the probability of obtaining $T$ sets of alternate positive and negative signs given $r_1$ and $r_2$.
This is simple enough when considering just a sequence of alternatives, as I have shown
elsewhere, but it is not easy to fit these results to the $\chi^2$ problem, nor, when this is possible,
will the choice of a critical region be straightforward. However, the results of sampling
experiments will be utilized to throw light on these points and it is hoped to discuss them,
with other questions arising, in a further publication.

17. It is possible that there are other criteria, depending on the arrangement of positive
and negative signs, which will be more sensitive than the $T$ criterion chosen. For example,
it is easy to calculate, given $r_1$ and $r_2$, the probability that the largest set is composed of a
sequence of $r'$ positive signs, and there are many other possibilities which might be considered.
It would appear that any criterion based on sign sequences can be shown to be
independent of $\chi^2$ by means of geometrical argument, and it will be necessary therefore to
consider the power of these different sign tests when referred to a specified set of alternative
hypotheses.

18. The main objection to the two criteria, $T$ and $P\{\chi^2\}\,P\{T\}$, that I have proposed in
this note is the one which was mentioned earlier; they are only applicable to the case where
there is just one restriction on $\chi^2$, i.e. when the totals of expected and observed frequencies
have been made to agree. It is possible to work out a slightly different form of the $T$ criterion
for each additional restriction which is put on $\chi^2$, and this has been done. It is preferable,
however, to delay publication until the results of an extensive sampling experiment are
complete in order to verify whether such theoretical assumptions as have been made are
reasonable.

REFERENCES 

Neyman, J. (1937). Skand. Aktuar. Tidskr. 20, 149-99. 

Neyman, J. & Pearson, E. S. (1928). Biometrika, 20 A, 263-94.





AN EXACT TEST FOR THE EQUALITY OF VARIANCES* 

By R. L. PLACKETT, M.A. 

Introduction 

The problem of testing the equality of variances and covariances in normal distributions 
is one which has received considerable attention; we have compiled a bibliography of some 
sixty papers, and shall issue a survey of these in due course; only papers vital to our discussion 
will be considered here. A precise instance of the type of situation we are considering is as 
follows: measurements of height, span and tibia length are made on each of 20 Englishmen, 
20 Scotsmen, 20 Welshmen and 20 Irishmen; it is required to know if the covariance matrix 
of the three characteristics is the same for each of the four nationalities. Nothing is known 
or assumed about the mean values of these characteristics in the four populations con- 
sidered, nor are we interested in testing any hypothesis concerning the means, although 
such a hypothesis may be the object of further investigations which assume that the four 
covariance matrices are the same; this latter assumption is inevitably made in multivariate 
analysis of variance. 

Wilks (1932) has already given the moments of the distribution of his criterion for testing 
the equality of several covariance matrices (on the hypothesis that the matrices are in fact 
equal) and Bishop (1939) put this criterion into an approximate workable shape. The test 
criterion given here differs from that of Wilks and has the advantage when one or two cor- 
related characteristics are being measured (height or height and span, for example) that its 
distribution is exactly known whatever the number of populations. Nair (1939) did, it is 
true, give the exact distribution of the Neyman & Pearson (1931) $L_1$ criterion for one measured
characteristic; and the exact distribution for two characteristics of Wilks’s generalization
of their criterion; but the form in which the distribution was obtained is very involved.
It is interesting to notice that from our standpoint the problem of testing the equality of 
several variances (i.e. the case of one measured characteristic) is, as will appear, brought 
within the framework of multiple correlation theory. In the general case of more than two 
characteristics the moments of the distribution of our criterion, like those of Wilks, are 
available. 

Outline of method 

In the usual terminology we consider $k$ $p$-variate normal distributions and are concerned
with testing the hypothesis that the corresponding variances and covariances are all equal.
The method we employ to test this hypothesis is essentially that which has been in use in
analysis of variance since its origination by Fisher; to test the equality of a set of $k$ quantities
we test whether $(k-1)$ orthogonal linear functions of the quantities are each zero. To illustrate
the application of this principle in the present instance take the particular case $p = 1$,
i.e. we wish to test the equality of the variances in $k$ univariate normal distributions. If
a typical observation from the $l$th distribution is $t_l$ $(l = 1, 2, \ldots, k)$, form $k$ mutually orthogonal
linear functions of the $t_l$ such that one is

$$u = t_1 + t_2 + \ldots + t_k.$$


* Communication from the National Physical Laboratory.




If the $(k-1)$ covariances of $u$ and each of the other linear functions are all zero then the
variances of the $k$ distributions must all be equal; this condition may be expressed by saying
that the multiple correlation coefficient of $u$ on the other linear functions is zero. Further,
if there are $n$ sample values of $u$ then the size of sample drawn from each distribution must
also be $n$ at least, and if no observations are to be discarded the size of each sample must be
$n$ exactly. Thus, although it is not a condition of the problem that the sizes of samples drawn
from the $k$ distributions must all be equal, it is a condition of our solution.

The extension of the foregoing principle to $p > 1$ is straightforward and is considered in
detail in the next section; the problem then becomes that of testing the independence of two
groups of variates, the first of size $p$, i.e. $p$ expressions of the form $u$; and the second of size
$p(k-1)$ comprising all the other orthogonal linear functions. This problem has been treated
by Wilks (1935, 1943) and the relevant distribution is expressible as an incomplete $\beta$-function
when $p = 1$ and 2 (for all $k$); an exact distribution is also known when $p = 3$ and 4 for $k = 2$.
Finally, since when $p = 1$ the criterion has the form of a multiple correlation coefficient, the
power of the test in this instance can be calculated by virtue of the work of Fisher (1928).


Discussion of the test 

A sample of $n$ observations is drawn from each of the $k$ $p$-variate normal distributions of
which the $l$th has the covariance matrix $V^{l}_{ij}$ $(l = 1, 2, \ldots, k;\ i, j = 1, 2, \ldots, p)$. It is required
to test the hypothesis that

$$V^{l}_{ij} = V^{m}_{ij} \quad (l, m = 1, 2, \ldots, k). \qquad (1)$$

The population means do not enter into the hypothesis and have arbitrary unknown values.
Where $i$, $j$, $l$, $m$ appear henceforth they will be understood to range over the values given
above unless otherwise stated. The observations may be written in the form of an $n \times kp$
matrix $X$ such that all those on the $i$th variate in the $l$th distribution are in column $(i-1)k + l$.
The ath observation in this column (a = 1 , 2, . . . , n) is denoted by ; the order of the elements 
in a column is assumed to be random. If this is doubted the observations should be randomly 
rearranged. 

We must emphasize here that the sample value of the criterion to be used to test (1) 
depends on this order, and there is thus, in a sense, a correspondence between and 
although these two quantities are, of course, uncorrelated when $l \ne m$. Most tests of a
hypothesis specifying nothing about the order in which observations are made or written 
down are themselves independent of it; ours is not, and different computers with the same 
data might well come to different conclusions although this does not affect the validity of 
the test, the significance level being overall what it should be. There is probably some loss 
of power which can, however, be offset by imbuing $\alpha$ with a certain physical meaning; but
we shall not discuss this question here. A criterion for testing normality depending on the 
order of arrangement of observations has been suggested by R. C. Geary (1935, pp. 316-17). 


Let now 



( 2 ) 


and let the corresponding $n \times kp$ matrix be $Z$. If $O = Z'Z$, where a prime is used to denote
the transpose of a matrix, then, apart from a factor $n$, $O$ is the matrix of sample variances
and covariances of all variables. We further define $S(k, p)$ as the sum of all $k^{2p}$ signed minors
formed by rows $l_1, l_2, \ldots, l_p$ and columns $m_1, m_2, \ldots, m_p$ of $O$, where

$$(i-1)k < l_i,\ m_i \le ik. \qquad (3)$$

$\bar{S}(k, p)$ is similarly defined for the matrix $\bar{O} = O^{-1}$ (we shall use this notation for the inverses
of matrices throughout).
We now proceed to prove the following

Theorem: $$W(k,p) = k^{2p} / S(k,p)\,\bar{S}(k,p)$$

is distributed like Wilks’s statistic for testing the hypothesis that two groups of variates
of sizes $p$ and $p(k-1)$, known to have been drawn from a $(kp)$-variate normal distribution,
are mutually independent (Wilks, 1935, 1943). If the groups are in fact mutually independent
then (1) is true.

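Proof. Introduce a $k \times k$ orthogonal matrix $B$, the elements of whose first column are all
equal (to $\pm 1/\sqrt{k}$) but which is otherwise quite arbitrary. Put

$$r = (i-1)k + l, \qquad u = (m-1)p + j, \qquad (4)$$

and form a $kp \times kp$ matrix $A$ such that

$$a_{ru} = \delta_{ij}\, b_{lm}, \qquad (5)$$

where $\delta_{ij} = 1$ $(i = j)$, otherwise 0. Clearly $A$ is also orthogonal. For example, suppose
$k = 4$, $p = 2$. Apart from a factor of $\pm\tfrac{1}{2}$ multiplying each element, let

$$B = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}.$$

Then

$$A = \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 0 & -1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & 1 & 0 & -1 & 0 \\
1 & 0 & -1 & 0 & -1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 & -1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & 1 & 0 & -1 \\
0 & 1 & 0 & -1 & 0 & -1 & 0 & 1
\end{pmatrix}.$$

When $p = 1$, $A = B$. Let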


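$$D = XA, \qquad Y = ZA, \qquad C = Y'Y = A'OA. \qquad (6)$$

Putting

$$s = (j-1)k + m, \qquad t = (l-1)p + i, \qquad (7)$$

and defining

$$t' = (l-1)p + i, \qquad u' = (m-1)p + j \qquad (l, m = 2, 3, \ldots, k), \qquad (8)$$

we have

$$\mathcal{E}(o_{rs}) = \delta_{lm}\,(n-1)\,V^{l}_{ij}, \qquad (9)$$

so that

$$\mathcal{E}(c_{tu}) = (n-1)\sum_{h=1}^{k} b_{hl}\, b_{hm}\, V^{h}_{ij}. \qquad (10)$$

Hence when

$$\mathcal{E}(c_{tu'}) = 0 \qquad (11)$$

equations (1) are satisfied, because for fixed $i$ and $j$ equations (10) can be solved and yield

$$(n-1)\,V^{l}_{ij} = f(i, j). \qquad (12)$$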



Denote a typical element of the $t$th column of $D$ by $d_t$. Then equations (11) are satisfied if
and only if $d_t$ and $d_{u'}$ are mutually independent.

A criterion for testing (11), obtained by likelihood-ratio methods, has been given by Wilks
(1935, 1943). This is

$$W(k,p) = \frac{|C|}{|c_{tu}|\,|c_{t'u'}|}, \qquad (13)$$

where $|c_{tu}|$ and $|c_{t'u'}|$ are the determinants of the two diagonal blocks of $C$, of orders $p$ and $p(k-1)$,
and is sometimes called the vector alienation coefficient. Let $C^{(p)}$ be the $p$th compound of $C$
(Aitken, 1939, p. 90), i.e. the matrix of all $p \times p$ minors of $C$; and $\bar{C}^{(p)}$ the $p$th compound of
$\bar{C} = C^{-1}$ (since $\bar{C}^{(p)}$ is the inverse of $C^{(p)}$ our notation is consistent). Then

$$W(k,p) = 1 / c^{(p)}_{11}\,\bar{c}^{(p)}_{11} \qquad (14)$$

by an application of Jacobi’s theorem on the minors of the adjugate (Aitken, 1939, p. 97).
Now by the Binet-Cauchy theorem (Aitken, 1939, p. 93),

$$C^{(p)} = (A')^{(p)}\, O^{(p)}\, A^{(p)}, \qquad \bar{C}^{(p)} = (A')^{(p)}\, \bar{O}^{(p)}\, A^{(p)}. \qquad (15)$$


Consider the elements in the first row of $(A')^{(p)}$. The first p rows of A' are of the form

$$\begin{pmatrix}
1\,1 \cdots 1 & 0\,0 \cdots 0 & 0\,0 \cdots 0 & \cdots & 0\,0 \cdots 0 \\
0\,0 \cdots 0 & 1\,1 \cdots 1 & 0\,0 \cdots 0 & \cdots & 0\,0 \cdots 0 \\
0\,0 \cdots 0 & 0\,0 \cdots 0 & 1\,1 \cdots 1 & \cdots & 0\,0 \cdots 0 \\
\vdots & \vdots & \vdots & & \vdots \\
0\,0 \cdots 0 & 0\,0 \cdots 0 & 0\,0 \cdots 0 & \cdots & 1\,1 \cdots 1
\end{pmatrix}$$

apart from the factor ± 1/√k multiplying each element. Therefore the only non-zero elements
in the first row of $(A')^{(p)}$ are those formed by taking one column from each of the p blocks
of k columns into which the first p rows of A' may be divided. All the non-zero elements
equal $k^{-p/2}$. Then from (15)

$c^{(p)}_{11} = S(k,p)\,k^{-p}, \quad \bar{c}^{(p)}_{11} = \bar{S}(k,p)\,k^{-p},$ (16)

so finally $W(k,p) = k^{2p}/[S(k,p)\,\bar{S}(k,p)].$ (17)

This completes the proof. 

Case of p = 1

Here $W(k,1) = k^2/[S(k,1)\,\bar{S}(k,1)]$,

where $S(k,1)$, $\bar{S}(k,1)$ are the sums of all elements of $G$, $G^{-1}$ respectively. If (1) is true,
$\mathcal{W}(k,1)$, the true value of $W(k,1)$, is unity. Define

$W(k,1) = 1 - R^2$ and $\mathcal{W}(k,1) = 1 - \mathcal{R}^2$, (18)

so that if (1) is true, $\mathcal{R} = 0$. The distribution of $R^2 = 1 - W(k,1)$ when $\mathcal{R} = 0$ is, as Wilks
pointed out, well known, being that of the multiple correlation coefficient (of $d_1$ on $d_2, d_3, \ldots, d_k$);
if in the usual notation

$I_x(a,b) = [B(a,b)]^{-1} \int_0^x x^{a-1}(1-x)^{b-1}\,dx,$ (19)

then the cumulative distribution function of $x = R^2$ is $I_x(\tfrac12(k-1), \tfrac12(n-k))$, values near 1 being
significant; that of $x = W(k,1)$ being $I_x(\tfrac12(n-k), \tfrac12(k-1))$ with small values significant. Tables
in convenient form have been calculated by Thompson (1941); otherwise we can convert
to the variance-ratio F by

$F = (n-k)(1-W)/[(k-1)W].$ (20)




It is clear that n must exceed k; for p variates, n exceeds pk in order that G may be
non-singular.

If the matrix A is defined instead as a kp × kp orthogonal matrix, the elements of whose
first column are all equal (cf. equation (5)), the problem is effectively reduced to the case
p = 1 whatever the value of p, and we can test exactly the somewhat indefinite hypotheses



nr+ n.+-+n,> = n t + ns+ + ns- 

(21) 

This may be applied in the following manner, for take k = p = 2 and obtain 


* 

Hi i + n* = + n 2 = v n + n* = n, + n 8 . 

(22) 

Thus 

Hr = n* and V 2 n = V* % . 

(23) 

If it is assumed 

nr - nr. 

(24) 

then 

li 

ea 

Hrt 

(26) 

and conversely. 

Case of p = 2 



The distribution of W(k, 2) has been given by Wilks (1935). If $x = \sqrt{W(k,2)}$, the
cumulative distribution function of x is

$I_x(n-2k, 2k-2).$ (26)


Small values of x are significant and n must exceed 2k. 

Case of p ≥ 3

For k = 2, p = 3 and 4, the exact distributions are again known and have been given by
Wilks in equations (36) and (37) respectively of his 1935 paper. The expressions are rather
complicated and we have not reproduced them here. For other values of k and p the moments
of W(k,p) are available; while more recently Wald & Brookner (1941) have obtained the
distribution in the form of an infinite series, calculating numerical values for the coefficients
in certain instances.

For p > 1, (17) becomes rather intractable as a means of calculating W(k,p). Indeed,
for k = 2 and p = 4 it is necessary

(i) to calculate 36 sample variances and covariances,

(ii) find the inverse of an 8 × 8 matrix,

(iii) calculate 612 4 × 4 determinants,

and it is clearly better to reintroduce the matrix A in some appropriate numerical form,
calculate Y = ZA and C = Y'Y, and find W(2, 4) from (13), a process which involves the
evaluation of an 8 × 8 and two 4 × 4 determinants.
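The numerical route just described (build A, form Y = ZA and C = Y'Y, then take the ratio of determinants in (13)) might be sketched as follows. This is an illustration under stated assumptions, not the author's own procedure: the function names are hypothetical, the columns of Z are assumed to be ordered by (4) with i the variate and l the population, Z is taken as deviations from the sample means, and B is obtained from a QR factorization simply because that yields some orthogonal matrix with a constant first column.

```python
import numpy as np

def orthogonal_with_constant_first_column(k):
    """Return a k x k orthogonal matrix whose first column is constant (+-1/sqrt(k))."""
    M = np.eye(k)
    M[:, 0] = 1.0                      # first column proportional to (1, ..., 1)
    Q, _ = np.linalg.qr(M)             # orthonormalise the columns of M
    return Q

def build_A(B, p):
    """kp x kp matrix with a_ru = delta_ij * b_lm, r = (i-1)k + l, u = (m-1)p + j (1-based)."""
    k = B.shape[0]
    A = np.zeros((k * p, k * p))
    for i in range(p):                 # variate index (0-based); delta_ij forces j = i
        for l in range(k):
            for m in range(k):
                A[i * k + l, m * p + i] = B[l, m]
    return A

def wilks_W(samples):
    """W(k, p) from k samples, each an (n, p) array, via Y = ZA, C = Y'Y and equation (13)."""
    k = len(samples)
    n, p = samples[0].shape
    # Assumed ordering: column (i-1)k + l of Z holds variate i of population l.
    Z = np.column_stack([samples[l][:, i] for i in range(p) for l in range(k)])
    Z = Z - Z.mean(axis=0)             # deviations from the sample means
    A = build_A(orthogonal_with_constant_first_column(k), p)
    Y = Z @ A
    C = Y.T @ Y
    return np.linalg.det(C) / (np.linalg.det(C[:p, :p]) * np.linalg.det(C[p:, p:]))

# Hypothetical data: k = 2 populations, p = 4 variates, n = 20 (n must exceed pk).
rng = np.random.default_rng(0)
k, p, n = 2, 4, 20
samples = [rng.normal(size=(n, p)) for _ in range(k)]
print(wilks_W(samples))
```

For p = 1 the same function should agree with formula (17), i.e. with k² divided by the product of the sums of all elements of G and of its inverse, which gives a convenient check on any particular choice of B.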

POWER OF THE TEST WHEN p = 1
From (17) the true value of $R^2$ is in general given by

$1 - \mathcal{R}^2 = k^2 \big/ \big[\big(\sum_l V^{(l)}_{11}\big)\big(\sum_l 1/V^{(l)}_{11}\big)\big],$ (27)

and thus the test will have equal power for all values of the variances such that the product
of their sum and the sum of their reciprocals is constant. Consequently 1 − W(k, 1) is
distributed like the multiple-correlation coefficient in samples from a population where the
true value is given by (27). The probability density function of this distribution has been
deduced by Fisher (1928) and can be integrated to give a finite series when (n − k) is even.
We find easily when k = 2 that in the $V^{(1)}_{11}, V^{(2)}_{11}$ quarter-plane the equipotentials are pairs
of lines

$V^{(2)}_{11} = aV^{(1)}_{11}$ and $aV^{(2)}_{11} = V^{(1)}_{11},$ (28)

where $a = (1 + \mathcal{R})/(1 - \mathcal{R}).$ (29)

For k > 2 the equipotential surfaces in k dimensions are cones through the origin situated
symmetrically with regard to the co-ordinate primes.
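The step from (27) to (28) and (29) when k = 2 may be filled in as follows. This is a short worked check, writing $V_1$, $V_2$ for the two population variances (a shorthand assumed here for brevity) and taking the true multiple correlation $\mathcal{R}$ as non-negative.

```latex
1 - \mathcal{R}^2
  = \frac{4}{(V_1 + V_2)\left(\dfrac{1}{V_1} + \dfrac{1}{V_2}\right)}
  = \frac{4 V_1 V_2}{(V_1 + V_2)^2},
\qquad
\mathcal{R} = \frac{|V_1 - V_2|}{V_1 + V_2},
\qquad
a = \frac{1 + \mathcal{R}}{1 - \mathcal{R}}
  = \frac{\max(V_1, V_2)}{\min(V_1, V_2)} .
```

A fixed value of $\mathcal{R}$ therefore corresponds to a fixed ratio a of the larger variance to the smaller, which gives the pair of lines through the origin in (28).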

Reverting to k = 2, three methods are available for testing the hypothesis that $V^{(1)}_{11} = V^{(2)}_{11}$.

(i) Fisher's z or $F = \exp(2z) = g_{11}/g_{22}.$ (30)

(ii) the $L_1$ criterion introduced by Pearson & Neyman (1930) and later extended to
k > 2 (Neyman & Pearson, 1931).

In the instance we are considering, i.e. equal sample sizes from both populations,

$L_1 = 2(g_{11}g_{22})^{1/2}/(g_{11} + g_{22})$ (31)

$\phantom{L_1} = 2F^{1/2}/(1 + F).$ (31a)

(iii) $W(2,1) = 4[g_{11}g_{22} - (g_{12})^2]/[(g_{11} + g_{22})^2 - (2g_{12})^2].$* (32)

Thus tests (i) and (ii) are exactly equivalent, as is known, the optimum critical region being
that corresponding to equal tails of the F-distribution. Criterion (iii) is that obtained by
Morgan (1939) and Pitman (1939), appearing as equation (12) in Morgan's paper, to test
that the variances in a normal bivariate population are equal. Morgan has compared the
powers of tests (i) and (iii) for n = 12, 25 and 100 at a significance level of 0.10 and for these
sample sizes it appears that the tests are effectively of equal power.

When n is large and, consequently, the two populations being independent, $(g_{12})^2/g_{11}g_{22}$
is converging in probability to zero,

$W(2,1) \sim L_1^2.$ (33)

The cumulative distribution functions of criteria (ii) and (iii) are respectively
$I_x(\tfrac12(n-1), \tfrac12)$ (Nayer, 1936) ($x = L_1^2$) and $I_x(\tfrac12(n-2), \tfrac12)$ ($x = W(2,1)$). Generally, W(k, 1) for large n
converges in probability to the harmonic mean of the sample variances divided by their
arithmetic mean; $L_1$ (for equal sample sizes) is exactly equal to the geometric mean divided by
the arithmetic mean.
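The relations among the three criteria for k = 2 can be checked numerically. The sketch below is illustrative only: the function names and the values of g11, g22 and g12 are hypothetical, and the formulae coded are simply (31), (31a) and (32) together with the harmonic-over-arithmetic-mean form.

```python
import math

def L1_two_samples(g11, g22):
    """Equation (31): geometric mean of g11, g22 divided by their arithmetic mean."""
    return 2.0 * math.sqrt(g11 * g22) / (g11 + g22)

def W21(g11, g22, g12):
    """Equation (32)."""
    return 4.0 * (g11 * g22 - g12 ** 2) / ((g11 + g22) ** 2 - (2.0 * g12) ** 2)

def harmonic_over_arithmetic(g):
    """Harmonic mean of the g_ll divided by their arithmetic mean."""
    k = len(g)
    return k ** 2 / (sum(g) * sum(1.0 / v for v in g))

g11, g22, g12 = 40.0, 90.0, 6.0          # hypothetical sums of squares and products
F = g11 / g22
print(L1_two_samples(g11, g22), 2.0 * math.sqrt(F) / (1.0 + F))    # (31) and (31a) agree
print(W21(g11, g22, g12), W21(g11, g22, 0.0), L1_two_samples(g11, g22) ** 2)
print(harmonic_over_arithmetic([g11, g22]))
```

With g12 set to zero, W(2, 1) coincides with the square of L1 and with the harmonic mean over the arithmetic mean, which is the content of (33).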


Example of the use of the test for a case with k = 4, p = 1

It is not easy to calculate W(k, 1) from equation (17) if k > 3. Indeed, the main value of (17)
lies in showing the form of solution, and in establishing that this is independent of the
particular orthogonal transformations used. In the following example, therefore, orthogonal
transformations are made at once and the multiple correlation coefficient is calculated from
the numerical data. This procedure is far quicker than that involved in calculating W(4, 1)
from (17).


* See Appendix. 





Below are given samples of 10 from each of four univariate normal populations:

    x_1    x_2    x_3    x_4
    -20    +24    + 4    +52
    - 1    +18    + 9    -24
    -11    +27    -27      0
    +10    +21    + 5    +48
    - 4    -48    - 3    +48
    + 7    +15    + 8    - 8
    + 5    +24    - 1    +56
    +18    -12    + 1    -64
    +13    -24    - 4    +12
    - 6    +12    + 5    -12

which have mean zero and standard deviations respectively 10, 30, 10, 40. Make the
following orthogonal transformation:

$y_1 = x_1 + x_2 + x_3 + x_4, \quad y_2 = x_1 + x_2 - x_3 - x_4,$

$y_3 = x_1 - x_2 + x_3 - x_4, \quad y_4 = x_1 - x_2 - x_3 + x_4,$

and obtain

    y_1    y_2    y_3    y_4
    +60    -52    -92    + 4
    + 2    +32    +14    -52
    -11    +43    -65    -11
    +84    -22    -54    +32
    - 7    -97    - 7    +95
    +22    +22    + 8    -24
    +84    -26    -76    +38
    -57    +69    +95    -57
    - 3    -19    +21    +53
    - 1    +13    - 1    -35


Form the matrix of sums of squares and cross-products, i.e. C. This is

    +18636.1   - 9646.9   -18232.9   + 7325.1
    - 9646.9   +21784.1   +12018.1   -19015.9
    -18232.9   +12018.1   +28692.1   - 9445.9
    + 7325.1   -19015.9   - 9445.9   +22008.1

The matrix of sample correlation coefficients is therefore

     1         -0.4788    -0.7885    +0.3617
    -0.4788     1         +0.4807    -0.8685
    -0.7885    +0.4807     1         -0.3759
    +0.3617    -0.8685    -0.3759     1



Hence the multiple correlation coefficient of $y_1$ on $y_2$, $y_3$, $y_4$ is given by $R^2$ = 0… (…
by the approximation indicated in the last paragraph of the preceding section …

the true value obtained from equation (27) with variances in the ratio 1:1…

The upper 10 and 5 % levels of significance, obtained from Thompson's tables with
$\nu_1 = n - k = 6$, $\nu_2 = k - 1 = 3$, are respectively 0.622 and 0.704. We find $L_1$ = 0.565, the
5 and 1 % levels obtained from Nayer's (1936) tables being respectively 0.797 and 0.719,
so that this test gives a more significant result than the one based on $R^2$. The relative merits
of $L_1$ and the test we have provided, which cannot be judged on the results of one example,
remain a problem to be investigated.
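The arithmetic of this example can be retraced from the tabulated values of y. The sketch below is a check only, under stated assumptions: the array holds the ten rows of y-values as printed, the variable names are hypothetical, the sample R² is obtained from (13) with p = 1, F from (20), and the population value from (27) using the stated standard deviations 10, 30, 10, 40.

```python
import numpy as np

# Transformed observations y_1 .. y_4 as printed (10 rows).
Y = np.array([
    [ 60, -52, -92,   4],
    [  2,  32,  14, -52],
    [-11,  43, -65, -11],
    [ 84, -22, -54,  32],
    [ -7, -97,  -7,  95],
    [ 22,  22,   8, -24],
    [ 84, -26, -76,  38],
    [-57,  69,  95, -57],
    [ -3, -19,  21,  53],
    [ -1,  13,  -1, -35],
], dtype=float)

n, k = Y.shape
D = Y - Y.mean(axis=0)                  # deviations from the column means
C = D.T @ D                             # sums of squares and cross-products

# Equation (13) with p = 1: W = |C| / (c_11 |C_22|), and R^2 = 1 - W.
W = np.linalg.det(C) / (C[0, 0] * np.linalg.det(C[1:, 1:]))
R2 = 1.0 - W
F = (n - k) * (1.0 - W) / ((k - 1) * W)    # equation (20)

# Population value from equation (27) for standard deviations 10, 30, 10, 40.
variances = np.array([10.0, 30.0, 10.0, 40.0]) ** 2
true_R2 = 1.0 - k ** 2 / (variances.sum() * (1.0 / variances).sum())

print(np.round(C, 1))
print(R2, F, true_R2)
```

The computed C can be compared entry by entry with the printed matrix, and R² with the tabulated levels 0.622 and 0.704.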


* I.e. calculated from 1 − (harmonic mean of $g_{ll}$)/(arithmetic mean of $g_{ll}$).













SUMMARY 

An exact test has been put forward for the equality of variances and covariances in any
number k of 1- or 2-variate normal populations; the test is also exact for two 3- or 4-variate
populations; but is restricted in application to equal sample sizes n from the k populations
where n exceeds pk, p being the number of variates. The moments of the criterion are
available for k p-variate populations where the statistic used is equivalent to that employed by
Wilks (1935) to test the independence of two groups of variates (of sizes p and p(k − 1)), and
has the same distribution. In the univariate case the power of the test is known as a function
of one parameter. Comparison with the $L_1$ criterion has already been made when p = 1
and k = 2, the tests being practically the same, and an example worked out of the use of the
test when p = 1.

Our thanks are due to E. C. Fieller for drawing our attention to the papers by Morgan and 
Pitman and suggesting that the test given there for the equality of two variances might be 
extended to more than two; also to Prof. E. S. Pearson for pointing out the need of certain
explanatory additions. 

The work described above has been carried out as part of the research programme of the 
National Physical Laboratory, and this paper is published by permission of the Director 
of the Laboratory. 


REFERENCES 

Aitken, A. C. (1939). Determinants and Matrices. Oliver and Boyd, Edinburgh.

Bishop, D. J. (1939). On a comprehensive test of the homogeneity of variances and covariances in
multivariate problems. Biometrika, 31, 31.

Fisher, R. A. (1928). The general sampling distribution of the multiple correlation coefficient.
Proc. Roy. Soc. A, 121, 654.

Geary, R. C. (1935). The ratio of the mean deviation to the standard deviation as a test of normality.
Biometrika, 27, 310.

Morgan, W. A. (1939). A test for the significance of the difference between the two variances in a
sample from a normal bivariate population. Biometrika, 31, 13.

Nair, U. S. (1939). The application of the moment function in the study of distribution laws in statistics.
Biometrika, 30, 274.

Nayer, P. P. N. (1936). An investigation into the application of Neyman and Pearson's $L_1$ test, with
tables of percentage limits. Statist. Res. Mem. 1, 38.

Neyman, J. & Pearson, E. S. (1931). On the problem of k samples. Bull. int. Acad. Cracovie, Série A,
p. 460.

Pearson, E. S. & Neyman, J. (1930). On the problem of two samples. Bull. int. Acad. Cracovie,
Série A, p. 73.

Pitman, E. J. G. (1939). A note on normal correlation. Biometrika, 31, 9.

Thompson, C. M. (1941). Tables of percentage points of the incomplete beta-function. Biometrika,
32, 108.

Wald, A. & Brookner, R. J. (1941). On the distribution of Wilks' statistic for testing the independence
of several groups of variates. Ann. Math. Statist. 12, 137.

Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471.

Wilks, S. S. (1935). On the independence of k sets of normally distributed statistical variables.
Econometrica, 3, 309.

Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.





APPENDIX 

As an illustration of the algebraic form of W(k, 1) the Editor has suggested to me that it
might be helpful to show the relation of the general formula (17) to the matrix G in this
simple case when k = 2. Here, using a common notation for a sample mean,

$g_{11} = \sum_{\alpha=1}^{n} (x_{1\alpha} - \bar{x}_{1\cdot})^2, \quad
g_{22} = \sum_{\alpha=1}^{n} (x_{2\alpha} - \bar{x}_{2\cdot})^2, \quad
g_{12} = \sum_{\alpha=1}^{n} (x_{1\alpha} - \bar{x}_{1\cdot})(x_{2\alpha} - \bar{x}_{2\cdot}),$

$G = \begin{pmatrix} g_{11} & g_{12} \\ g_{12} & g_{22} \end{pmatrix}.$

Whence, using (17), (32) is at once obtained for W(2, 1). For k > 2 the full expression for
S(k, 1) in terms of the g's is complicated and the matrix notation becomes essential.





THE ESTIMATION FROM INDIVIDUAL RECORDS OF THE 
RELATIONSHIP BETWEEN DOSE AND QUANTAL RESPONSE 

By D. J. FINNEY 

Lecturer in the Design and Analysis of Scientific Experiment , University of Oxford 

1. Introduction. 

A type of biometric problem frequently encountered by the statistician is that which
requires the estimation and study of a relationship between dose and response. 'Dose' is
here a general term indicating the magnitude of a stimulus applied to certain test subjects,
and 'response' is a measure of the effect which the stimulus produces on the subjects. When
the test subjects are living matter, whether plants, animals or bacteria, pieces of tissue or
single cells, the response to a specified dose is unlikely to be constant in repeated trials, and
regression methods must be used in the estimation of the relationship.

In some classes of data, the response is 'all-or-nothing' or quantal, and cannot be measured
quantitatively. Ordinary regression methods are then no longer applicable; methods based
on the transformation of the proportion of subjects showing the response at any dose level
to the normal equivalent deviate (Gaddum, 1933), or to the probit (Bliss, 1934a, b), however,
have proved very powerful for simplifying the statistical analysis. In recent years, full
accounts of the underlying theory of these transformations, and of their application, have
been published by various authors (see, for example, Bliss, 1935a, b; Finney, 1947, 1948).
An additional difficulty sometimes found is that the intensity of the stimulus cannot be 
selected in advance of a test, but can only be measured after the test has taken place; only 
rarely will two or more subjects happen to receive exactly the same dose, and more usually 
the records consist of a list of doses with, for each, a statement of whether a single subject 
receiving that dose responded or not.* For example, in some methods for the testing of 
insecticidal potency, poison bait is offered to individual insects; the dose received by any 
insect cannot be specified in advance, and must instead be measured as the amount of poison 
ingested. 

Data from experiments of this kind do not give empirical values for the proportion of 
subjects responding at each dose level, except in the trivial sense that every dose shows either 
zero or 100 % responding. Nevertheless, as Bliss (1938) has pointed out, the probit method 
can still be applied to estimation of the dose-response relationship. He has given a numerical 
example, though without showing full details of the working, but has admitted that assess- 
ment of the error of estimation presents some theoretical difficulties (Finney, 1947, §43). 
An interesting example of experimental results requiring this type of analysis has recently 
been brought to the notice of the writer by Mr R. W. Gilliatt. These introduce an additional 
complication, since the dose is expressed in terms of two measurements, and a probit plane 
(Finney, 1943) or other bivariate regression function must therefore be estimated. An 
account of the analysis, with computational details, may help those who have encountered 
analogous problems in biological or other investigations. 

* When response does not involve death or serious alteration of the test subject, one subject may be 
used many times; the example discussed in this paper is an instance. The form of the data will be the same, 
though the interpretation may require that tolerance variation between and within subjects be dis- 
tinguished. 




2. The data 

Research in human physiology has demonstrated that, under carefully controlled
experimental conditions, a transient reflex vaso-constriction in the skin of the digits may follow
a single deep breath (Bolton, Carmichael & Stürup, 1936). Gilliatt (1947) has found that the
response depends in part on the volume of air taken in by the subject. Plethysmographic
measurement of the volume changes in a finger was used to indicate the occurrence of a
response, but assessment of the degree of vaso-constriction, in order to relate this to the
inspiratory stimulus, was not practicable. Thus the records obtained for each test show only
the volume of air inspired, the average rate of inspiration, and whether or not
vaso-constriction was produced. The above brief outline is sufficient for appreciation of the statistical
problem, but a full account of the experimental procedure may be found in Gilliatt's paper;
the results discussed here are presented in his Fig. 5.

Fig. 1. Contours of dose-response surface for 0.1, 0.25, 0.5, 0.75 and 0.9 frequency of response,
estimated from three-parameter equation. ○ no vaso-constriction; ● vaso-constriction.

The data, which Mr Gilliatt has kindly made available to the writer, were obtained from
thirty-nine tests, in twenty of which vaso-constriction occurred. Tests were made on three
different subjects, nine on D.W., eight on V.P.W., and twenty-two on S.J.S.; the results of
the tests, with the subjects in this order, are shown in Table 1. In Fig. 1 are shown the
thirty-nine combinations of volume in litres (V) and rate of inspiration in litres per second
(R), together with indications of whether or not the subject responded under these conditions.
Inspection of Fig. 1 shows that, in general, when both V and R were small no response
occurred, when either was large (unless the other was very small) the response occurred, and
in an intermediate region the proportion of responses increased as either V or R increased.
There was no sharply defined threshold separating combinations of V and R giving the
response from those giving no response; instead, there appeared to be a probability of response
ranging from practical certainty under some conditions to zero under others.

Table 1. Experimental data and details of calculations

As an aid to fuller understanding of the influence of breathing on vaso-constriction, ex- 
amination of the relationship between V, R, and the probability of response seemed desirable. 
Since so few observations were available for each subject, the data were unlikely to be suffi- 
cient to show differences between subjects; this point is discussed later, but in the main 
analysis the distinction between subjects is ignored. For any form of response assessment, 
the testing of one subject many times must introduce a danger that the result of one test 
will be affected not only by its own stimulus but by preceding stimuli and by the effects they 
produced. In this investigation, each subject was given a number of preliminary tests until 
he appeared to have settled into the routine. The observations recorded in Table 1 were 
obtained after these preliminary trials; they are tabulated in the order of testing, and show 
no indication of effects of previous history, but clearly such effects would have to be very 
pronounced if they were to be detectable on this amount of data. 

3. Method of analysis 

Preliminary examination of the data suggested that the occurrence of a response was largely 
determined by the magnitude of VR, the product of volume and rate, curves on which the 
probability of response has a constant value being approximately hyperbolae of the form 

VR = constant. (1) 

A little consideration shows that an equation of this type is more reasonable than an equation 
linear in V and R, though the data are almost certainly inadequate for discriminating between 
many alternative types of relationship that might be postulated. A system of curves similar 
to, but rather more general than, equation (1), namely, 

$V^{\beta_1} R^{\beta_2} = \text{constant},$ (2)

was selected for trial; this equation may alternatively be regarded as representing a series 
of parallel linear relationships 

$\beta_1 \log V + \beta_2 \log R = \text{constant}$ (2a)

between the logarithms of volume and rate for a fixed probability of response. 

A specified combination of V and R will not necessarily always give the same result 
(response or no response) with a subject, for, even though the subject is unaltered, minor 
uncontrolled variations in his environment may affect his susceptibility to the applied 
stimulus. For a particular value of V, the threshold value of R (the value which under the 
conditions prevailing at any instant would be just sufficient to produce a response) will 
have a frequency distribution; similarly, for a particular R, there will be a frequency dis- 
tribution of threshold values of V. If these distributions may be taken as normal in log V 
and log R, and, for simplicity, they are supposed to be such that the mean of either logarithm 
is linearly related to the selected value of the other, then the probability of response will be
determined by an expression of the form

$\beta_1 \log V + \beta_2 \log R,$

and the threshold values of this quantity will be normally distributed. If $x_1$ and $x_2$ are
written for log(10V) and log(10R) respectively (the factor of 10 is introduced in order to
make $x_1$ and $x_2$ always positive), this statement enables the probability of response, P, to
be expressed as

$P = \dfrac{1}{\sqrt{2\pi}} \int_{-\infty}^{\alpha + \beta_1 x_1 + \beta_2 x_2} e^{-\frac{1}{2}u^2}\,du,$ (3)

where $\alpha$, $\beta_1$ and $\beta_2$ are parameters to be estimated from the data. The estimation may be
regarded as the fitting of a probit regression plane, for Y, the probit of P, is given by

$Y = 5 + \alpha + \beta_1 x_1 + \beta_2 x_2.$ (4)

Substitution of the value of Y corresponding to a specified probability gives the required
linear relationship, equation (2a), between $x_1$ and $x_2$ for that probability, from which the
estimated curves of constant probability, equation (2), may easily be derived.
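Equations (3) and (4) translate directly into a small function. The sketch below is illustrative: the function name is hypothetical, logarithms are assumed to be to base 10, and the parameter values in the usage line are arbitrary numbers chosen only to show the call, not estimates from the data.

```python
import math
from statistics import NormalDist

def response_probability(V, R, alpha, beta1, beta2):
    """Return (P, Y) from equations (3) and (4) for volume V and rate R."""
    x1 = math.log10(10.0 * V)          # assumed: common logarithms
    x2 = math.log10(10.0 * R)
    Y = 5.0 + alpha + beta1 * x1 + beta2 * x2     # probit, equation (4)
    P = NormalDist().cdf(Y - 5.0)                 # equation (3)
    return P, Y

# Arbitrary illustrative parameter values (hypothetical).
P, Y = response_probability(V=1.5, R=1.2, alpha=-14.0, beta1=6.5, beta2=6.0)
print(round(P, 3), round(Y, 2))
```

Because Y is linear in $x_1$ and $x_2$, fixing P fixes Y and hence a straight line in the ($x_1$, $x_2$) plane, which is the relationship (2a).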

The procedure for fitting a probit plane has been described elsewhere (Finney, 1943, 1947, 
§31), and its chief features need no alteration for application to individual records. Pro- 
viding that a first approximation to the equation can be guessed, repeated cycles of com- 
putation will give values for the parameters which approach more and more closely to the 
maximum likelihood estimates. Care in the choice of the first approximation will reduce 
the number of cycles needed; a poor choice will delay the convergence, though it will not 
affect the ultimate result. Since only a single observation is available for each combination 
of $x_1$ and $x_2$, every working probit is either a maximum or minimum value, according to
whether or not the response occurs. When there is only one dose factor, in the fitting of a 
probit regression line to records of individuals, grouping of doses and treatment of the 
observations in a group as if they related to an average dose may reduce the labour of the 
early computing cycles, but, since it will tend to give an underestimate of the regression 
coefficient, the final cycle may need to use the detailed records. Bliss (1938) has given an 
example illustrating grouping of this kind. Grouping is less easily applied, however, when two 
or more dose factors have to be used, and, for the data under discussion, the individual records 
were used throughout except in the formation of the first approximation. 

In the standard form of probit analysis, with moderately large numbers of observations
at each level of dose, a χ² is usually computed for testing the significance of discrepancies
between the data and the fitted equation; this χ² is numerically the same as would be obtained
by calculation from expected and observed numbers of responses and non-responses for
each dose. If there are few observations in any dose group, the expected number of responses
or of non-responses (or of both) is likely to be small, and, as is well known, χ² may then fail
to follow the sampling distribution tabulated for that statistic. Data of the type under
discussion here are extreme examples of this situation, the number of observations for each
dose being reduced to unity, so that any disturbance of the χ² distribution is likely to be
encountered in its most acute form. No complete theoretical investigation of this matter
has yet been made, but the practical implications are discussed more fully in §5.

On the assumption that the estimate of equation (4) is an adequate representation of the
data, lines of constant response probability may be obtained for any specified probability;
these may be plotted according to equation (2) on a V, R scale. Standard statistical processes
also enable fiducial limits to be assigned to the position of any of these curves. The difficulty
of dealing with the estimation of error for individual records, and the inadequacy of the
data for any sensitive test of whether equation (2) is a satisfactory representation of the
system of curves, throw doubts on the exact interpretation of these fiducial limits.
Nevertheless, they give some idea of the confidence that can be attached to the estimated curves,
at least for moderate values of V and R; for extremes of either measurement, far more
extensive data would be needed before much faith could be placed in the fitted equation.

4. Computations for estimating the three-parameter equation
In this and the two succeeding sections, the computations for Gilliatt's data will be described
in detail. The first five columns of Table 1 show the thirty-nine pairs of values of V and R
which occurred in the experiments, followed by the corresponding values of $x_1$ and $x_2$,
together with a statement of whether or not the subject responded. Before the probit
computations could be initiated, a first approximation to equation (4) was needed; this was
obtained with the aid of the suggestion, from the plotting of the data shown in Fig. 1, that
the constant probability curves were approximately the hyperbolae of equation (1), or
alternatively $x_1 + x_2$ = constant.

As Bliss (1938) has pointed out, there is no objection to the use of overlapping groups in the
formation of the first approximation. The data were therefore grouped according to the
value of ($x_1 + x_2$), as shown below, and the proportion of responses in each group was obtained
from Table 1:

    x_1 + x_2    Responses    Proportion (p)    Probit of p    First approximation
    1.5-1.9        0/7           0.00              —               3.3
    1.6-2.0        0/7           0.00              —               3.6
    1.7-2.1        2/7           0.29              4.4             3.9
    1.8-2.2        2/9           0.22              4.2             4.2
    1.9-2.3        3/14          0.21              4.2             4.5
    2.0-2.4        8/19          0.42              4.8             4.8
    2.1-2.5       13/24          0.54              5.1             5.1
    2.2-2.6       17/25          0.68              5.5             5.4
    2.3-2.7       16/17          0.94              6.6             5.7
    2.4-2.8       12/12          1.00              —               6.0


Each proportion was regarded as an estimate for the median value of ($x_1 + x_2$) in the group,
i.e. 1.7, 1.8, 1.9, ..., and its probit was read from one of the standard tables (Finney, 1947,
Table I; Fisher & Yates, 1947, Table IX). As may be seen above, these probits were fairly
well fitted by the guessed equation

$Y = -1.8 + 3x_1 + 3x_2,$ (5)

which was therefore used as a first approximation to equation (4).
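The empirical probits and the first approximation can be recomputed from the group counts. In the sketch below the probit is obtained from the inverse normal integral instead of being read from tables, the group medians and counts are those of the grouping described above, and the function names are hypothetical.

```python
from statistics import NormalDist

# (median of x1 + x2 in the group, number responding, number tested)
groups = [
    (1.7, 0, 7), (1.8, 0, 7), (1.9, 2, 7), (2.0, 2, 9), (2.1, 3, 14),
    (2.2, 8, 19), (2.3, 13, 24), (2.4, 17, 25), (2.5, 16, 17), (2.6, 12, 12),
]

def empirical_probit(r, n):
    """Probit of the proportion r/n; undefined (None) for 0 % or 100 % response."""
    if r == 0 or r == n:
        return None
    return 5.0 + NormalDist().inv_cdf(r / n)

def first_approximation(s):
    """Equation (5) evaluated where x1 + x2 = s."""
    return -1.8 + 3.0 * s

rows = [(s, empirical_probit(r, n), first_approximation(s)) for s, r, n in groups]
for s, probit, guess in rows:
    shown = "-" if probit is None else f"{probit:.1f}"
    print(f"{s:.1f}  {shown:>4}  {guess:.1f}")
```

Setting the two columns side by side shows how closely the guessed equation follows the empirical probits in the middle of the range.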

A first set of expected probits was calculated from equation (5), and inserted as Y in an
earlier version of Table 1. A cycle of routine probit calculations, just as described in the next
two paragraphs, then led to an improved approximation to the required estimate, on which
a second cycle of improvement was based. The figures shown in Table 1 relate to the fourth
of these cycles, based upon the approximation

$Y = -9.127 + 6.666x_1 + 5.906x_2$ (6)

from the third cycle. Equation (6) is very different from equation (5), suggesting that more
care might have been given to the selection of a first approximation; that the grouping
adopted would lead to underestimation of the regression coefficients was expected, but
insufficient allowance for this was made. Of course the 'improvement' in the approximations
refers to their approach to the solution of the maximum likelihood equations, and is not
necessarily always an approach to the true relationship.

The column of expected probits, Y, in Table 1 was calculated by substitution of pairs of
values $x_1$, $x_2$ in equation (6); one decimal place here is quite sufficient. The weighting
coefficient, w, for each observation was then read from tables (Finney, 1947, Table II; Fisher &
Yates, 1947, Table XI) and entered in its column. The working probit, y, takes a maximum
value for every observation giving a response and a minimum value for every observation
giving no response, since these give empirical rates of 100 % and zero respectively; values
of y were read directly from Finney's table (1947, Table III; or, less simply for the minimum
values, from Fisher & Yates, 1947, Table XI). The numbers of decimal places shown for the
entries in Table 1 are sufficient for data of this type; indeed possibly one decimal for w and
for y would be enough. Columns $wx_1$, $wx_2$, and wy were then filled, and the weighted sums of
squares and products of deviations, required for the calculation of the regression of y on $x_1$
and $x_2$, were completed at the bottom of the table.

The equations giving the estimates of the regression coefficients, $b_1$ and $b_2$, are

$0.494528\,b_1 - 0.382729\,b_2 = 1.032130,$

$-0.382729\,b_1 + 0.517714\,b_2 = 0.516978.$

Later calculations use the variances and covariance of $b_1$ and $b_2$; the equations were therefore
solved by first obtaining the matrix inverse to that formed by the coefficients of $b_1$ and $b_2$
(Finney, 1943, 1947, §31; Fisher, 1946, §29). This matrix is

$V = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix} = \begin{pmatrix} 4.726144 & 3.493883 \\ 3.493883 & 4.514482 \end{pmatrix};$ (7)

the accuracy of the data is insufficient to need the number of decimal figures shown here,
but their retention assists the checking and maintains the internal consistency of the
analysis. Now $b_1 = 1.032130\,v_{11} + 0.516978\,v_{12}$

$= 6.68426,$

and similarly $b_2 = 5.94003$.
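The inversion and the two regression coefficients are easily checked by machine. The following sketch simply repeats the arithmetic with the coefficients quoted above; the variable names are hypothetical.

```python
import numpy as np

# Coefficient matrix and right-hand sides of the two equations for b1 and b2.
M = np.array([[0.494528, -0.382729],
              [-0.382729, 0.517714]])
rhs = np.array([1.032130, 0.516978])

V = np.linalg.inv(M)        # matrix (7): v11, v12, v22
b = V @ rhs                 # b1 = rhs1*v11 + rhs2*v12, b2 = rhs1*v12 + rhs2*v22

print(np.round(V, 6))
print(np.round(b, 5))
```

Keeping the inverse matrix, rather than solving for $b_1$ and $b_2$ alone, is what makes the variances and covariance available for the later calculations.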

The estimate of equation (4) is then

$Y = \bar{y} + b_1(x_1 - \bar{x}_1) + b_2(x_2 - \bar{x}_2)$

or $Y = -9.182 + 6.6843x_1 + 5.9400x_2,$ (8)

a result which differs little from equation (6) and may be regarded as a sufficiently close
approximation to the maximum likelihood estimate. Since

$b_2/b_1 = 0.889,$ (9)

equation (8) may be transformed to give

$V R^{0.889} = \text{constant}$ (10)

as the relationship estimated to exist between V and R for a specified probability; the value
of the constant can be obtained by substitution of the probit of the probability in equation
(8), a process which gives 1.10, 1.36, 1.71, 2.16 and 2.66 for probabilities of 10, 25, 50, 75
and 90 % respectively. Typical contours have been drawn in Fig. 1 so as to indicate the form
of the relationship.
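The constants of equation (10) follow from equation (8) once the probit of each probability is known. The sketch below assumes common logarithms, so that $x_1 = 1 + \log_{10} V$ and $x_2 = 1 + \log_{10} R$, and obtains probits from the inverse normal integral; the function name is hypothetical.

```python
from statistics import NormalDist

a0, b1, b2 = -9.182, 6.6843, 5.9400     # constants of equation (8)

def contour_constant(prob):
    """Value of V * R**(b2/b1) on the contour where the response probability is `prob`."""
    Y = 5.0 + NormalDist().inv_cdf(prob)            # probit of the probability
    # b1*x1 + b2*x2 = Y - a0 with x1 = 1 + log10 V and x2 = 1 + log10 R, so
    # log10 V + (b2/b1) log10 R = (Y - a0)/b1 - (1 + b2/b1).
    log_const = (Y - a0) / b1 - (1.0 + b2 / b1)
    return 10.0 ** log_const

for prob in (0.10, 0.25, 0.50, 0.75, 0.90):
    print(prob, round(contour_constant(prob), 2))
```

The exponent used is the unrounded ratio $b_2/b_1$; the difference from 0.889 is too small to affect the second decimal of the constants.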





5. Goodness of fit 

When probit analysis is applied to data containing many observations in each dose group,
the weighted sum of squares of deviations between the empirical probits and the predictions
from the fitted equations is a χ², with degrees of freedom equal to the number of dose groups
reduced by the number of fitted parameters. If $S_{uv}$ is written for the weighted sum of products
of deviations of variates u and v, application of this method here would give

$\chi^2_{[36]} = S_{yy} - b_1 S_{x_1 y} - b_2 S_{x_2 y}$

$= 40.045 - 6.6843 \times 1.0321 - 5.9400 \times 0.5170$

$= 30.08.$ (11)

When the dose groups are small, however, the $\chi^2$ so calculated cannot be trusted as an
indicator of the significance of deviations from the fitted equation, and it is presumably most
unreliable when each group is reduced to a single observation. Apart from slight discrepancies
caused by imperfect approximation to the maximum likelihood solution, the $\chi^2$ in equation
(11) is algebraically identical with that which would be derived, by the usual form of
calculations, from comparison of observed numbers responding and not responding in each
group with expectations computed from the fitted equation. As is well known from the
study of contingency tables, when the expectations in some classes are small the sampling
distribution of such a $\chi^2$ may be very different from that shown in the standard tables
(Finney, 1947, Table VI; Fisher & Yates, 1947, Table IV); with data from individual records,
no class can have an expectation greater than unity, and for many the expectation will be
very much less, so that the discrepancy from the tabulated $\chi^2$ distribution is likely to be
serious.

The general effect of small expectations on the random sampling distribution of $\chi^2$ appears
to be that the mean value remains about equal to the number of degrees of freedom, but that
the variance in repeated sampling is increased. Consequently, samples from a population
according with the null hypothesis are likely to show an excess of very high and very low
values, as judged by the tables of $\chi^2$. Thus there is little danger that significant evidence of
deviations from expectation will be overlooked in an uncritical application of the test, though
apparently significant values of $\chi^2$ need to be examined with care before they are regarded
as evidence sufficient to justify rejection of the null hypothesis. Low values, as in Gilliatt's
data (30 with 36 degrees of freedom), need cause little alarm, for they clearly indicate no
serious deviation from expectation. High values may in the first instance be compared
with the standard tables of the $\chi^2$ distribution; if they fall beyond the significance level, a
closer examination should be made before judging the null hypothesis to be untenable, for
the apparent significance may be due to large contributions from one or two aberrant points.
Gilliatt's data provide an illustration of this. The expected probits for each pair of values
of $x_1$ and $x_2$ in Table 1 have been calculated from equation (8), and the probabilities,
$P$ ($= 1 - Q$), corresponding to these have been entered in the last column of the table;
$P$ is then the expectation of the number of responses for each dose. The $\chi^2$ obtained from the
observed and expected numbers in seventy-eight classes is easily seen to be the sum of $Q/P$
for all doses giving a response, plus $P/Q$ for all giving no response. Inspection of the column
for $P$ shows small contributions to $\chi^2$ everywhere, except for two instances of responses with
probabilities of only 0.098 and 0.128, contributing 9 and 7 respectively; clearly the occurrence
of these two responses as the most extreme events in thirty-nine trials need not be regarded
as serious evidence against the null hypothesis. The result of calculating $\chi^2$ by this more
laborious process is a total of 30.3, which agrees closely with that already given in
equation (11).
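The rule of summing $Q/P$ over responses and $P/Q$ over non-responses follows because a single
observation with expectations $P$ and $Q$ contributes $(1 - P)^2/P + Q^2/Q = Q/P$ when a response
occurs, and $P/Q$ otherwise. A minimal sketch, with a hypothetical function name and only the two
probabilities quoted above taken from the text:

```python
def chi_square_individual(records):
    """Sum Q/P over responses and P/Q over non-responses.

    `records` is an iterable of (P, responded) pairs, where P is the expected
    probability of response and `responded` is True or False.
    """
    total = 0.0
    for p, responded in records:
        q = 1.0 - p
        total += q / p if responded else p / q
    return total


# Contributions of the two aberrant responses mentioned in the text.
first = chi_square_individual([(0.098, True)])
second = chi_square_individual([(0.128, True)])
print(round(first, 1), round(second, 1))
```

Applied to all thirty-nine pairs of Table 1 (not reproduced in this section), the same function
would give the total discussed in the text.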

One method of modifying a $\chi^2$ test so as to remove its extreme sensitivity to deviations
from small expectations is to combine expected and observed frequencies over several
adjacent groups, so as to obtain groups with larger expectations; the number of degrees of
freedom is then taken as the number of remaining groups less the number of fitted parameters.
Of course the groups must be chosen objectively, and without regard to the agreement
between the frequencies. The statistic still will not follow the $\chi^2$ distribution exactly,
but the approximation should be fairly satisfactory under the usual restriction that the
groups be so chosen that none of the expected frequencies is small. This procedure often has
to be adopted in probit analysis because of small expectations at very low or very high doses
(Finney, 1947, § 18). With individual records, however, only very extensive grouping will
give expectations sufficiently large for the $\chi^2$ test to be trusted; the reduction of a large $\chi^2$
to a value below the significance level might then appear indicative of an insensitive test
rather than of absence of serious discrepancies.*

Probably no completely satisfactory solution of the difficulty is to be expected. Individual
records usually arise from experimental work in which the obtaining of large numbers of
observations presents considerable difficulty. Often the whole series will consist of fewer than
fifty observations, and, unless previous information enables the range of doses to be chosen
satisfactorily, many of the observations will be made at doses for which response is either
almost certain or almost impossible. Even if the individual dose-tolerances could be measured
directly, a test of normality of their distribution (which is what the $\chi^2$ test attempts to
provide) could not be very sensitive when based on only fifty measurements; if, instead,
only quantal data are available, indicating merely whether a dose is below or above the
tolerance value, a sensitive normality test is still less likely to exist (Finney, 1947, §43).

Gilliatt's data, a series of only thirty-nine observations, provide an extreme instance of
the difficulty of formulating a sensitive test of goodness of fit. Nevertheless, an attempt has
been made to examine the discrepancies between the observations and the null hypothesis
expressed by equation (4). In Table 2 are compared the observed and expected frequencies
when the data are grouped according to the value of $VR^{0.889}$. This is equivalent to a grouping
based on the value of $Y$, the expected probit in equation (8), and, as this quantity had been
evaluated for each observation in order to give $P$, it was used in the construction of Table 2.
Since three parameters have been estimated from the data, four groups is the least number
for giving a $\chi^2$ test. The limits of the groups were chosen so as to give similar numbers of
observations in each. Inspection of Table 2 shows that the groups are still too small for a $\chi^2$
test to be trusted, thus suggesting that the data are inadequate for any useful test of goodness
of fit to be made. The only anomaly in Table 2 is the occurrence of two responses where the
expectation is 0.3, and this is clearly insufficient to cause much worry.

The inadequacy of the data for detecting any differences in sensitivity between the three
subjects may be seen from Table 3. The first nine entries in Table 1 relate to D.W., and
summation of the values of $P$ gives the expected number of responses for this subject; similarly
the next eight and the last twenty-two entries give the numbers for V.P.W. and S.J.S.
respectively. Inspection of Table 1 shows that the tests on each subject were fairly widely
distributed over the range of values of $x_1$ and $x_2$. Table 3 shows excellent agreement between
totals of observed and expected responses for each subject, thus suggesting that any
individual differences that exist are small by comparison with the variation in sensitivity of
the same subject in different tests.

* In his discussion of the analysis of individual records, Bliss (1938) suggests adjustment of the $\chi^2$
test, not by altering the calculation of the statistic but by reducing the number of degrees of freedom
allotted to it; he gives an empirical rule for the reduction, based upon the expectations in terminal dose
groups. This method, however, not only lacks any theoretical basis, but seems liable to have an effect
opposite to that which is needed; it will attribute significance to high values of $\chi^2$ even more readily
than will the unadjusted test.


Table 2. Comparison of observed and expected frequencies of response

                               Frequencies of results
  Expected            Observed                    Expected
  probit, Y        -       +     Total          -         +

  Below 4          8       2      10           9.72      0.28
  4-5              6       0       6           3.92      2.08
  5-6              5       8      13           4.26      8.74
  Above 6          0      10      10           0.59      9.41

  Total           19      20      39          18.49     20.51
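The grouped comparison of Table 2 can be turned into a statistic in the usual way, although the
text declines to rely on it because the expectations remain small. The sketch below is
illustrative only (the names are hypothetical); it computes the contribution of each cell and
shows that the single cell with expectation 0.28 supplies most of the total, which is the kind of
sensitivity to small expectations described above.

```python
# (observed -, observed +, expected -, expected +) for the four groups of Table 2.
TABLE_2 = [
    (8, 2, 9.72, 0.28),
    (6, 0, 3.92, 2.08),
    (5, 8, 4.26, 8.74),
    (0, 10, 0.59, 9.41),
]


def pearson_contributions(rows):
    """Return the (O - E)**2 / E contribution of every cell, row by row."""
    cells = []
    for obs_neg, obs_pos, exp_neg, exp_pos in rows:
        cells.append((obs_neg - exp_neg) ** 2 / exp_neg)
        cells.append((obs_pos - exp_pos) ** 2 / exp_pos)
    return cells


cells = pearson_contributions(TABLE_2)
total = sum(cells)
degrees_of_freedom = len(TABLE_2) - 3  # four groups less three fitted parameters

print(round(total, 2), degrees_of_freedom, round(max(cells), 2))
```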


Table 3. Comparison of subjects

                               Frequencies of results
                      Observed                    Expected
  Subject          -       +     Total          -         +

  D.W.             3       6       9           4.0       5.0
  V.P.W.           4       4       8           3.5       4.5
  S.J.S.          12      10      22          11.0      11.0

  Total           19      20      39          18.5      20.5


6. Limits of error 



The variances of $b_1$ and $b_2$ and the covariance between them are respectively $v_{11}$, $v_{22}$ and $v_{12}$
as defined in equation (7). Hence the variance of $Y$, the expected probit corresponding to
any pair of values $x_1$, $x_2$, is

$$V(Y) = \frac{1}{Sw} + v_{11}(x_1 - \bar{x}_1)^2 + 2v_{12}(x_1 - \bar{x}_1)(x_2 - \bar{x}_2) + v_{22}(x_2 - \bar{x}_2)^2, \tag{12}$$

where $Sw$ is the sum of the $w$ column in Table 1. All these variances are derived from binomial
probability distributions. In the usual form of probit analysis, with a batch of subjects at
each dose, the precision of the estimated relationship between dose and response is discussed
as though the variation were normal, an assumption which is justifiable on account of the
large numbers of individuals involved. Here, with only thirty-nine observations in all, the
assumption is less safe, but may be adopted for lack of any more trustworthy method of
dealing with the data. It is unlikely to be seriously misleading, except possibly for extreme
levels of the response probability, $P$.

Equation (12) may now be used in the assignment of fiducial limits to any one of the curves
of equal probability given by equation (10). For suppose that $t$ is the normal deviate
corresponding to the significance level to be used in defining the fiducial limits, and that $Y_0$ is the
probit of a probability $P_0$. Then for any values of $x_1$, $x_2$ for which

$$(Y - Y_0)^2 > t^2 V(Y),$$

where $Y$ is determined from equation (8), the expected probit differs significantly from $Y_0$,
and for values of $x_1$, $x_2$ which reverse the inequality the difference is not significant. Therefore
the equation

$$(Y - Y_0)^2 = t^2 V(Y) \tag{13}$$

gives the limiting values of $(x_1, x_2)$ for which the null hypothesis that the true expected probit
is $Y_0$ is not untenable in the light of the data; in other words, equation (13) defines curves in
the $(x_1, x_2)$ plane which are fiducial limits to the estimated locus of points having a constant
response probability $P_0$. These curves are clearly hyperbolae. In Figs. 2 and 3, the 5 %
fiducial limit curves ($t = 1.960$) for $P_0 = 0.5$ and $P_0 = 0.9$ respectively have been plotted in
the $(V, R)$ plane; details of the calculation need not be given here, but Fig. 2, for example,
is derived from the equation

$$(14.182 - 6.6843x_1 - 5.9400x_2)^2 = 3.841\left[\frac{1}{Sw} + 4.7261(x_1 - 1.0496)^2
+ 6.9878(x_1 - 1.0496)(x_2 - 1.2483) + 4.5145(x_2 - 1.2483)^2\right].$$

Fig. 2. Fiducial limits (5 % probability) to 0.5 frequency contour of Fig. 1.
Horizontal axis: volume of inspiration (litres). ○ no vaso-constriction; ● vaso-constriction.
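The test that defines the fiducial curves can be sketched in a few lines. The coefficients,
variances and means below are those quoted in the text; the value of $Sw$ is not legible in this
section, so it is left as a parameter and the figure used in the example is a placeholder chosen
purely for illustration. The function names are hypothetical.

```python
# Quantities quoted in the text: equation (8), equation (7), and the means
# appearing in the equation from which Fig. 2 is derived.
A, B1, B2 = -9.182, 6.6843, 5.9400
V11, V12, V22 = 4.726144, 3.493883, 4.514482
X1_BAR, X2_BAR = 1.0496, 1.2483


def expected_probit(x1, x2):
    """Expected probit Y from equation (8)."""
    return A + B1 * x1 + B2 * x2


def variance_of_probit(x1, x2, sw):
    """V(Y) from equation (12); `sw` is the sum of the weights, Sw."""
    d1, d2 = x1 - X1_BAR, x2 - X2_BAR
    return 1.0 / sw + V11 * d1 * d1 + 2.0 * V12 * d1 * d2 + V22 * d2 * d2


def differs_significantly(x1, x2, y0, sw, t=1.960):
    """True when (Y - Y0)**2 > t**2 * V(Y), i.e. the point lies outside the limits."""
    gap = expected_probit(x1, x2) - y0
    return gap * gap > t * t * variance_of_probit(x1, x2, sw)


SW_PLACEHOLDER = 10.0  # hypothetical value, NOT the Sw of Table 1

print(differs_significantly(X1_BAR, X2_BAR, 5.0, SW_PLACEHOLDER))
print(differs_significantly(X1_BAR, X2_BAR, 9.0, SW_PLACEHOLDER))
```

Tracing the boundary where the inequality becomes an equality, for a grid of $x_1$ values, would
give curves of the kind plotted in Figs. 2 and 3.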

The pairs of curves are like hyperbolae in form. That for $P_0 = 0.5$ defines a band on either
side of the estimated relationship which is quite narrow for moderate values of $V$ and $R$
though naturally it widens considerably at the extremes. That for $P_0 = 0.9$, as might be
expected from general consideration of the problem, allows much greater uncertainty on the
side of high values of $V$ and $R$; similarly, for $P_0 = 0.1$, the band would be relatively wider
on the side of low values of $V$ or $R$.

Fig. 3. Fiducial limits (5 % probability) to 0.9 frequency contour of Fig. 1.
○ no vaso-constriction; ● vaso-constriction.

The curves shown in Figs. 1, 2 and 3 may be regarded as plane sections, for selected values
of $Y$, of a three-dimensional diagram relating $Y$ to $V$ and $R$. In terms of $x_1$, $x_2$ instead of
$V$, $R$, this diagram is the three-dimensional analogue of the familiar diagram showing a
regression line with hyperbolic curves indicating limits of error on either side; the line
generalizes to a plane, and the limits are now defined by two sheets of a hyperboloid, one
above and one below the plane.

The theoretical basis of the curves illustrated in Figs. 2 and 3 is perhaps insecure, but 
undoubtedly they give a useful indication of the dependence of the probability of response 
on V and R and of the reliability of the estimation of this relationship. Much as an experi- 
menter might wish for a more precise assessment of the effects of V and R, experience sug- 
gests that results such as those obtained here are as good as can be expected from a total of 
thirty-nine quantal observations. 

7. The two-parameter equation 

In § 3, the equation

$$VR = \text{constant} \tag{1}$$

was suggested as an expression of the curves of constant response probability, but the more
complex equation (2) was adopted for use in §§4-6. There are no theoretical reasons for
believing that equation (1) represents the true form of the relationship, and the more general
form was chosen in order that the complete calculations might be illustrated. The values of
$b_1$ and $b_2$ obtained, however, do not differ very greatly by comparison with the standard
error of their difference; in fact

$$V(b_1 - b_2) = v_{11} - 2v_{12} + v_{22} = 2.253,$$

and therefore $b_1 - b_2 = 0.744 \pm 1.501$.
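The arithmetic behind this comparison is brief. The following sketch, which uses only the
variances, covariance and coefficients already quoted, is offered as a check rather than as part
of the original computations; the variable names are illustrative.

```python
from math import sqrt

v11, v12, v22 = 4.726144, 3.493883, 4.514482  # equation (7)
b1, b2 = 6.68426, 5.94003                     # estimated regression coefficients

var_diff = v11 - 2.0 * v12 + v22   # variance of (b1 - b2)
se_diff = sqrt(var_diff)           # standard error of the difference
ratio = (b1 - b2) / se_diff        # to be compared with a normal deviate such as 1.960

print(round(var_diff, 3), round(se_diff, 3), round(ratio, 2))
```

A ratio well below 1.960 corresponds to the statement that the difference between the two
coefficients is not significant.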

In the absence of any significant difference between the regression coefficients, the common
scientific procedure of preferring the simpler hypothesis (Occam's Razor) suggests that
equation (4) might be replaced by

$$Y = \alpha + \beta(x_1 + x_2). \tag{14}$$

For the estimation of equation (14), the computations are similar to, but shorter than, those
of § 4, since $(x_1 + x_2)$ may be replaced by a single variate, $x$, and a simple regression calculated;
the calculations in §4 were used to give a first set of expected probits, from which was
derived the estimate

$$Y = -9.475 + 6.4067(x_1 + x_2). \tag{15}$$

Only two parameters have been estimated from the data, and calculation as for equation
(11) gives

$$\chi^2_{[37]} = 28.76.$$

The difference between the two $\chi^2$ values may be taken as a further criterion of whether or
not the extra parameter is needed, closely related to the test of significance of $(b_1 - b_2)$;

$$\chi^2_{[1]} = 1.32$$

is not significant, though again the validity of the $\chi^2$ test is in doubt.

Substitution of the probit of a specified probability in equation (15) gives the value of the
constant in equation (1). For the 50 % response probability, for example, the constant is
1.82; over the range of values tested, the curves

$$VR^{0.889} = 1.71 \quad\text{and}\quad VR = 1.82$$

differ only slightly. Similarly, fiducial limits to $(x_1 + x_2)$ may be calculated, for any $Y_0$, as
upper and lower values of the product $VR$. No special interest attaches to these calculations;
the novelties due to the individual records are exactly as for the three-parameter equation
discussed in earlier sections, and otherwise the method is entirely that of ordinary probit
analysis (Finney, 1947, Chapter 4). For comparison with the three-parameter equation,
diagrams similar to Figs. 2 and 3 may be prepared; both the constant probability curves and
the fiducial limits are then true hyperbolae. Fig. 4 shows the results for a 50 % response
probability, and is to be compared with Fig. 2. The constant probability curve in Fig. 4
differs little from that in Fig. 2, though naturally the difference increases for large values of
$V$ or $R$ where the curves are less well determined. For moderate values of $V$ and $R$, the fiducial
limit curves are practically the same as the corresponding curves in Fig. 2, but for more
extreme values they lie much closer to the curve of constant probability; since the data
show no significant difference between $b_1$ and $b_2$, it is to be expected that a more precisely
estimated relationship between stimulus and response will be obtained if an assumption that
$\beta_1 = \beta_2$ is made, so that the information on the two regression coefficients can be combined,
and this shows itself by narrowing the zone of error for the constant probability curve.

Fig. 4. Contour of dose-response surface for 0.5 frequency of response, estimated from two-parameter
equation, and its 5 % fiducial limits (compare Fig. 2). ○ no vaso-constriction; ● vaso-constriction.

8. Summary 

The method of probit analysis has been developed to assist the study of the relationship 
between the magnitude of a stimulus and the proportion of tests in which a particular 
quantal response to that stimulus appears. In some research problems, the stimulus cannot 
be controlled sufficiently to make possible the administration of a specified magnitude, 
though the stimulus actually received by any one subject can later be measured. It will 
then seldom happen that two subjects receive exactly the same ‘dose’, and the data for
statistical analysis will generally consist of a series of doses with, for each, a statement of
whether or not a single subject showed the characteristic response. 

Even for data of this type, the probit transformation can aid the estimation of the
relationship between dose and the probability of response. The calculations leading to the estimate
are more tedious than is usual in probit analysis, because of slow convergence from a
provisional equation to the final form, but follow the usual pattern. The validity of the $\chi^2$ test
of goodness of fit (in reality a test for the normality of distribution of individual tolerances)
must be doubted, however, since the disturbance due to small class numbers will be
encountered in its most extreme form. Extensive grouping of results for adjacent doses will
provide a test less open to objection, though this will generally be insensitive to all but the
grossest deviations from normality; indeed, no valid sensitive test is to be expected with
individual records unless these are very numerous.

In this paper, the calculations have been illustrated on data relating to a reflex
vaso-constriction which sometimes occurs in the skin of the digits of human subjects after a single
deep breath. The relationship between the occurrence of this response and two dose factors,
the volume and the rate of inspiration, has been estimated for the combined records from
three subjects; inclusion of two dose factors complicates the analysis, since a bivariate
regression equation must be fitted, but does not affect the underlying theory. The $\chi^2$ test
has been discussed at length, though there is no indication of non-normality or of
heterogeneity of the data. The reliability with which the dependence of the probability of response
on the dose factors is estimated has also been examined, and curves bounding fiducial
regions, within which the true probability contours may confidently be asserted to lie, have
been determined. This method of representing the limits of error is applicable to other forms
of probit analysis involving two dose factors and is not restricted to individual records,
though it has not previously been described.

I am indebted to Mr R. W. Gilliatt, of the Department of Physiology, both for permission 
to make use of his data in an illustration of the statistical methods of my paper and for 
assistance in describing his experimental procedure. My thanks are due also to Miss M. Callow, 
who prepared Figs. 1-4. 

REFERENCES 

Bliss, C. I. (1934a). The method of probits. Science, 79, 38-9.

Bliss, C. I. (1934b). The method of probits - a correction. Science, 79, 409-10.

Bliss, C. I. (1935a). The calculation of the dosage-mortality curve. Ann. Appl. Biol. 22, 134-67.

Bliss, C. I. (1935b). The comparison of dosage-mortality data. Ann. Appl. Biol. 22, 307-33.

Bliss, C. I. (1938). The determination of dosage-mortality curves from small numbers. Quart. J.
Pharm. 11, 192-210.

Bolton, B., Carmichael, E. A. & Stürup, G. (1936). Vaso-constriction following deep inspiration.
J. Physiol. 86, 83-94.

Finney, D. J. (1943). The statistical treatment of toxicological data relating to more than one dosage
factor. Ann. Appl. Biol. 30, 71-9.

Finney, D. J. (1947). Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve.
Cambridge: University Press.

Finney, D. J. (1948). The principles of biological assay. J.R. statist. Soc. Suppl. 9, 40-91.

Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.

Fisher, R. A. & Yates, F. (1947). Statistical Tables for Biological, Agricultural and Medical Research,
3rd ed. Edinburgh: Oliver and Boyd.

Gaddum, J. H. (1933). Reports on biological standards. III. Methods of biological assay depending
on a quantal response. Spec. Rep. Ser. Med. Res. Coun., Lond., no. 183.

Gilliatt, R. W. (1947). Vaso-constriction in the finger following deep inspiration. J. Physiol. (in the Press).





A POWER FUNCTION FOR TESTS OF RANDOMNESS 
IN A SEQUENCE OF ALTERNATIVES 


By F. N. DAVID 


1. During recent years attention has been focused on what might be called the ‘group’
test for randomness in a sequence of alternatives. Thus, if $E$ denote the happening of an
event, and $\bar{E}$ its negation, the number of alternations of $E$ and $\bar{E}$ in a sequence supposedly
random has been chosen as a test criterion. This test has been put to different uses by
W. L. Stevens (1939), A. Wald & J. Wolfowitz (1940) and F. N. David (1947). It seems worth
while therefore to enquire what is the power of this test against a set of specifically defined
alternate hypotheses. The hypothesis to be tested will be that there is randomness within
the sequence, with the alternate hypothesis that if there is no randomness then there is
dependence of the type found in a simple Markoff chain. The same procedure will hold good
for dependence of the types found in more complex chains although in these cases the
enumeration is a little troublesome.


2. If there is a sequence of dependent events

$$E_1, E_2, E_3, \ldots, E_n,$$

then it is an elementary proposition of the probability calculus that

$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2 \mid E_1\}\,P\{E_3 \mid E_1 E_2\} \ldots P\{E_n \mid E_1 E_2 \ldots E_{n-1}\}.$$

If the events are independent, then

$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2\}\,P\{E_3\} \ldots P\{E_n\}.$$

This relation will be the basis of $H_0$, the hypothesis to be tested. If there is dependence as
in a simple Markoff chain, then mathematically each event will be dependent on the event
immediately preceding it, but will be independent of any of the other events. In this case
we shall have

$$P\{E_1 E_2 E_3 \ldots E_{n-1} E_n\} = P\{E_1\}\,P\{E_2 \mid E_1\}\,P\{E_3 \mid E_2\} \ldots P\{E_n \mid E_{n-1}\}.$$

This relation will be the basis of $H_1$, the hypothesis alternate to $H_0$.


3. For the hypothesis, $H_0$, let the probability that an event $E$ will occur in a single trial
be $p$, and let the probability of $\bar{E}$ (the negation of $E$) be $q$, where $p + q = 1$. The probability
of obtaining any given sequence of $r_1$ $E$'s and $r_2$ $\bar{E}$'s will be

$$p^{r_1} q^{r_2}.$$

The number of ways in which $r_1$ $E$'s and $r_2$ $\bar{E}$'s may be arranged to form $2t$ and $2t + 1$ sets of
$E$'s and $\bar{E}$'s alternately is, respectively,

$$N_{2t} = \frac{2\,(r_1 - 1)!\,(r_2 - 1)!}{\{(t - 1)!\}^2\,(r_1 - t)!\,(r_2 - t)!}
\quad\text{and}\quad
N_{2t+1} = N_{2t}\,\frac{r_1 + r_2 - 2t}{2t}.$$

Writing $k = 2t$ or $2t + 1$ as desired, the probability of obtaining a sequence of $r_1$ $E$'s and $r_2$ $\bar{E}$'s
arranged in $k$ sets is

$$P\{k \mid r_1, r_2, H_0\} = \frac{N_k\,p^{r_1} q^{r_2}}{\sum_{\text{all } k} N_k\,p^{r_1} q^{r_2}} = \frac{N_k}{\sum_{\text{all } k} N_k}.$$

$k$ may take values $2, 3, \ldots, 2r_2$ if $r_1 = r_2$, and values $2, 3, \ldots, 2r_2 + 1$ if $r_1 > r_2$.
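The number of arrangements into $k$ sets can be checked by direct enumeration for small
sequences. The sketch below is illustrative: it uses the standard closed forms for the number of
arrangements with $2t$ and $2t + 1$ sets, written with binomial coefficients, and compares them
with a brute-force count. The function names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_formula(r1, r2, k):
    """Number of arrangements of r1 E's and r2 not-E's forming exactly k sets."""
    t, odd = divmod(k, 2)
    if t < 1:
        return 0
    if not odd:  # k = 2t: t sets of each kind, starting with either kind
        return 2 * comb(r1 - 1, t - 1) * comb(r2 - 1, t - 1)
    # k = 2t + 1: one kind forms t + 1 sets, the other t sets
    return (comb(r1 - 1, t) * comb(r2 - 1, t - 1)
            + comb(r1 - 1, t - 1) * comb(r2 - 1, t))


def count_brute_force(r1, r2):
    """Count sets in every arrangement; returns {k: number of arrangements}."""
    n = r1 + r2
    counts = {}
    for positions in combinations(range(n), r1):
        chosen = set(positions)
        seq = [i in chosen for i in range(n)]
        k = 1 + sum(seq[i] != seq[i - 1] for i in range(1, n))
        counts[k] = counts.get(k, 0) + 1
    return counts


for r1, r2 in [(4, 3), (5, 5)]:
    brute = count_brute_force(r1, r2)
    print((r1, r2), all(count_formula(r1, r2, k) == n for k, n in brute.items()))
```

Dividing each count by the total number of arrangements gives the probability of $k$ sets under
$H_0$, since every arrangement has the same probability.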



4. Following the orthodox procedure, in order to test the hypothesis, $H_0$, it is necessary
to find two numbers $k_1$ and $k_2$ such that

$$P\{k < k_1 \mid H_0\} \le \tfrac{1}{2}\epsilon, \qquad P\{k > k_2 \mid H_0\} \le \tfrac{1}{2}\epsilon,$$

and therefore

$$P\{k_1 \le k \le k_2 \mid H_0\} \ge 1 - \epsilon,$$

where $\epsilon$ is a number arbitrarily at choice. If an observed number of sets, say $k'$, falls outside
the limits $k_1$ and $k_2$ then the hypothesis $H_0$ will be rejected in favour of some alternate
hypothesis, $H_1$. Alternately if $H_0$ is not true, but $H_1$ is, then

$$1 - P\{k_1 \le k \le k_2 \mid H_1\}$$

will be the power of the test in the sense of the word as used by Neyman & Pearson. Whether
$k_1$ or $k_2$ is chosen to judge the significance of an observed $k'$ will depend on which departure
from randomness it is most important not to overlook. If the alternate hypothesis is that
there is positive dependence in the chain, i.e. that $E$ having occurred in the $s$th trial it is
more likely to occur in the $(s + 1)$st trial, then $k_1$ would be chosen. Such a situation was
envisaged in a proposed smooth test to supplement the $\chi^2$ criterion (David, 1947). If, however,
the alternate hypothesis is that there is negative dependence, i.e. that $E$ having occurred in
the $s$th trial, it is less likely to occur in the $(s + 1)$st trial, then $k_2$ would be the appropriate
criterion. If it is immaterial whether the departure from randomness is positive or negative
dependence, then both $k_1$ and $k_2$ may be used.

5. We now consider the alternate hypothesis, $H_1$. Write $E_s$ for the occurrence of the
event $E$ in the $s$th trial and $\bar{E}_s$ for its negation. Let

$$P\{E_1\} = P, \quad P\{\bar{E}_1\} = Q, \quad P + Q = 1 \quad\text{and}\quad P \ge Q,$$

$$P\{E_s \mid E_{s-1}\} = p_1, \quad P\{\bar{E}_s \mid E_{s-1}\} = q_1,$$

$$P\{E_s \mid \bar{E}_{s-1}\} = p_2, \quad P\{\bar{E}_s \mid \bar{E}_{s-1}\} = q_2.$$

Thus $p_1$ and $q_2$ are probabilities of no change and $p_2$ and $q_1$ probabilities of a change. If the
events are independent then

$$p_1 = p_2 = P \quad\text{and}\quad q_1 = q_2 = Q.$$


6. In calculating the probability of obtaining any given sequence, what will matter will
be the number of changes from $E$ to $\bar{E}$ and back again. Let $f_t(r_1)$ be the number of ways in
which $r_1$ $E$'s can be arranged in $t$ groups, i.e. let

$$f_t(r_1) = \frac{(r_1 - 1)!}{(t - 1)!\,(r_1 - t)!}.$$

If there are $2t$ groups in a sequence of $r_1$ $E$'s and $r_2$ $\bar{E}$'s, the number of ways of obtaining such
a sequence will be

$$f_t(r_1)\,f_t(r_2)$$

if the sequence starts with $E$ or with $\bar{E}$. The probability of obtaining any given sequence of
$r_1$ $E$'s and $r_2$ $\bar{E}$'s of $2t$ groups will be

$$P\,p_1^{r_1 - t}\,q_1^{t}\,p_2^{t - 1}\,q_2^{r_2 - t} \quad\text{or}\quad Q\,q_2^{r_2 - t}\,p_2^{t}\,q_1^{t - 1}\,p_1^{r_1 - t}.$$

This follows from the fact that a sequence of $2t$ groups beginning with $E$ will imply $t$ changes
from $E$ to $\bar{E}$ and $t - 1$ changes from $\bar{E}$ to $E$. The changes are reversed in number if the sequence
starts with $\bar{E}$. For $2t + 1$ groups the number of ways of obtaining the sequence will be

$$f_{t+1}(r_1)\,f_t(r_2) \quad\text{or}\quad f_t(r_1)\,f_{t+1}(r_2)$$

according as the sequence begins with $E$ or $\bar{E}$. The respective probabilities will be

$$P\,p_1^{r_1 - t - 1}\,q_1^{t}\,p_2^{t}\,q_2^{r_2 - t} \quad\text{and}\quad Q\,q_2^{r_2 - t - 1}\,p_2^{t}\,q_1^{t}\,p_1^{r_1 - t}.$$

The probability of obtaining a sequence of $r_1$ $E$'s and $r_2$ $\bar{E}$'s in $2t$ groups will
therefore be, under hypothesis $H_1$,

$$P\{2t \mid r_1 r_2 H_1\} = \frac{1}{S}\left(\frac{p_2 q_1}{p_1 q_2}\right)^{t} f_t(r_1)\,f_t(r_2)\left(\frac{P}{p_2} + \frac{Q}{q_1}\right),$$

where

$$S = \sum_t \left(\frac{p_2 q_1}{p_1 q_2}\right)^{t}\left[f_t(r_1)\,f_t(r_2)\left(\frac{P}{p_2} + \frac{Q}{q_1}\right)
+ \frac{P}{p_1}\,f_{t+1}(r_1)\,f_t(r_2) + \frac{Q}{q_2}\,f_t(r_1)\,f_{t+1}(r_2)\right].$$

The probability of obtaining $r_1$ $E$'s and $r_2$ $\bar{E}$'s in $2t + 1$ groups will be similarly

$$P\{2t + 1 \mid r_1 r_2 H_1\} = \frac{1}{S}\left(\frac{p_2 q_1}{p_1 q_2}\right)^{t}\left[\frac{P}{p_1}\,f_{t+1}(r_1)\,f_t(r_2) + \frac{Q}{q_2}\,f_t(r_1)\,f_{t+1}(r_2)\right].$$

7. So far no mention has been made of any possible connexion between $p_1$, $q_1$, $p_2$ and $q_2$.
It is obvious that in all cases we shall have

$$p_1 + q_1 = 1, \quad p_2 + q_2 = 1,$$

but the connexion between $p_1$ and $p_2$ is not immediate. We shall make the simplifying
assumption which is perhaps most closely related to practical problems, and shall state that where
nothing is known about the $s - 1$ trials preceding the $s$th trial, $P\{E_s\} = P$ and $P\{\bar{E}_s\} = Q$.
Under this assumption we have

$$p_2 = \frac{P(1 - p_1)}{Q}.$$

This result is reached easily by noticing that

$$P\{E_s\} = P\{E_s E_{s-1}\} + P\{E_s \bar{E}_{s-1}\} = P\{E_{s-1}\}\,P\{E_s \mid E_{s-1}\} + P\{\bar{E}_{s-1}\}\,P\{E_s \mid \bar{E}_{s-1}\},$$

whence $P = Pp_1 + Qp_2$.


8. The alternative hypothesis chosen to illustrate the power function formulae is that
there is positive dependence in the sequence, i.e. $k_1$ is found so that

$$P\{k \le k_1 \mid H_0\} \le \epsilon \quad\text{and}\quad 1 - P\{k > k_1 \mid H_1\}$$

is calculated, when $p_1 \ge P$. For economy of drawing, several power curves, or what are really
sections of a kind of power surface, plotted to coordinates $P$, $p_1$, have been put together in
the diagrams of Fig. 1. For example the bottom left-hand diagram shows for $r_1 = r_2 = 10$
sections of the conditional power surface for $P = 0.5$, 0.6 and 0.75. When $H_0$ is true and
$P = p_1$, we have the 5 % risk of rejecting $H_0$ wrongly. As $p_1 - P$ increases the chance of
detecting the fact increases, but in a way dependent on $P$. The other three diagrams show
similar sections of the surfaces with $r_1 = r_2 = 5$, with $r_1 = 14$, $r_2 = 6$ and with $r_1 = 7$, $r_2 = 3$.
In practice it will not be known what the value of $P$ is, but the curves show reasonably well
how the power of the test varies as $P$ and $p_1$ (and therefore $p_2$) vary. It is clear that the test
for randomness under discussion is most powerful when the numbers of alternates are equal,
i.e. when $r_1 = r_2$. The power declines sharply when $r_1$ increases at the expense of $r_2$.
Another point which emerges is that the test is only moderately powerful, against the given
alternate hypothesis tested, when $r_1 + r_2 = 20$, and it would appear therefore that if it
was desired not to overlook a possible departure from randomness in the form of positive
dependence in the chain, then the sequence should consist of at least 20 units.
The question of other possible tests we shall not discuss at this stage.
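The calculation behind curves of this kind can be sketched as follows. This is an illustration
under stated assumptions rather than the computation actually used for Fig. 1: the weights are
built from the sequence probabilities of § 6, $p_2$ is obtained from $P = Pp_1 + Qp_2$, and the
function names and the default $\epsilon = 0.05$ are choices made here. The sketch requires
$0 < p_1 < 1$ and $0 < p_2 < 1$.

```python
from math import comb


def f(t, r):
    """Number of ways r like events can be arranged in t non-empty groups."""
    return comb(r - 1, t - 1) if 1 <= t <= r else 0


def group_distribution(r1, r2, P, p1):
    """Distribution of the number of groups k, conditional on r1 and r2.

    p2 follows from the assumption P = P*p1 + Q*p2; setting p1 = P gives H0.
    """
    Q, q1 = 1.0 - P, 1.0 - p1
    p2 = P * q1 / Q
    q2 = 1.0 - p2
    rho = (p2 * q1) / (p1 * q2)
    weights = {}
    for t in range(1, min(r1, r2) + 1):
        even = rho ** t * f(t, r1) * f(t, r2) * (P / p2 + Q / q1)
        odd = rho ** t * (P / p1 * f(t + 1, r1) * f(t, r2)
                          + Q / q2 * f(t, r1) * f(t + 1, r2))
        if even > 0:
            weights[2 * t] = even
        if odd > 0:
            weights[2 * t + 1] = odd
    total = sum(weights.values())
    return {k: w / total for k, w in sorted(weights.items())}


def lower_critical_value(r1, r2, P, eps=0.05):
    """Largest k1 with P{k <= k1 | H0} <= eps, or None if there is none."""
    cumulative, k1 = 0.0, None
    for k, prob in group_distribution(r1, r2, P, P).items():
        cumulative += prob
        if cumulative > eps:
            break
        k1 = k
    return k1


def conditional_power(r1, r2, P, p1, eps=0.05):
    """P{k <= k1 | H1}: chance of detecting positive dependence, given r1 and r2."""
    k1 = lower_critical_value(r1, r2, P, eps)
    if k1 is None:
        return 0.0
    dist = group_distribution(r1, r2, P, p1)
    return sum(prob for k, prob in dist.items() if k <= k1)


for p1 in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(p1, round(conditional_power(10, 10, 0.5, p1), 3))
```

Because $k$ is discrete, the risk of rejection when $p_1 = P$ generally falls somewhat below the
nominal $\epsilon$ rather than equalling it.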



Fig. 1. Conditional power curves when the alternate hypothesis is positive dependence.
Panels: sequence of 20, $r_1 = 14$, $r_2 = 6$; sequence of 10, $r_1 = 7$, $r_2 = 3$; sequence of 20,
$r_1 = r_2 = 10$; sequence of 10, $r_1 = r_2 = 5$. Vertical axis: power of test. Horizontal axis:
scale for $P$ and $p_1$.


9. It will be noticed that P{2t or 2i + 1 1 r 1 r i H 1 } which have been loosely termed power 
function formulae are not power functions in the sense originally defined by Neyman & 
Pearson, but they appear to involve a justifiable extension of that idea. In order to dis- 
tinguish them from the usual meaning of the words power function, I shall refer to them as 
conditional power functions. The theory of the conditional power function may be stated 
briefly in the following way. It is assumed that all possible samples (or sequences) may be 
classified according to their composition. Suppose that there are k of these mutually exclu- 
sive classes, which are also the only possible, say C v C 9 , C k . We have considered only the 
case where k is finite but it appears likely that the method can be extended to cover the case 

where k is enumerably infinite. These classes, C v C 2 G k will correspond to regions forming 

a partition of the sample space. 

Let Hq be the hypothesis tested and w 0 be the critical region used for the rejection of this 

hypothesis. Given that a sample is in C t (say), and that an alternate hypothesis H x is true, 

then the probability that H 0 will be rejected is 

Df p™.. n l TJ i P{Eew 0 C { I H x } 

P{Eew 0 G { | EeC { , H x ) = 

where w 0 C { means the region common to w 0 and to C\ and, following the Neyman-Pearson 
notation, E is the sample point. Regarded as a function of H x this is the conditional power 
function of the test associated with w 0 in the subset C t of samples. 

The Neyman-Pearson power function, which we might call here the overall power function, will be

$$P\{E \in w_0 \mid H_1\} = \sum_{i=1}^{k} P\{E \in w_0 C_i \mid H_1\} = \sum_{i=1}^{k} P\{E \in w_0 C_i \mid E \in C_i, H_1\}\, P\{E \in C_i \mid H_1\},$$

which may be looked on as a weighted average of the conditional power functions.
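The weighted-average relation can be illustrated numerically. The sketch below is only an illustration: the three classes and all the probabilities in it are hypothetical values chosen for the example, not figures from the paper.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only for illustration.
# class_prob[i] stands for P{E in C_i | H_1}
# cond_power[i] stands for P{E in w_0 C_i | E in C_i, H_1}
class_prob = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
cond_power = [Fraction(2, 5), Fraction(7, 10), Fraction(9, 10)]


def overall_power(class_prob, cond_power):
    """Overall power as the class-probability-weighted mean of the conditional powers."""
    if sum(class_prob) != 1:
        raise ValueError("class probabilities must sum to 1")
    return sum(p * c for p, c in zip(class_prob, cond_power))


power = overall_power(class_prob, cond_power)
print(power)  # 59/100
```

Because the weights sum to unity, the overall power necessarily lies between the smallest and the largest conditional power.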


10. There seems to be no reason why $w_0$ should not be built up of portions $w_0 C_i$, these portions being chosen to maximize each term of the summation, i.e. $w_0 C_i$ chosen to maximize the conditional power function. For example, to revert to the specific case of randomness within a sequence with which we have been dealing, the different partitions of $r$ $(= r_1 + r_2)$ are the mutually exclusive and only possible classes $C_i$. It is conceivable, although practically not very likely, that for each of these classes there will exist a different test which is more powerful to detect specifically defined departures from the basic hypothesis tested than any other test. The decision as to which is the most powerful test, against the same specifically defined alternatives, to use for any given class will be decided by the conditional power function. Once this has been decided the procedure for the complete test of significance may be laid down. This will be: (i) count the number of alternatives in the sequence, i.e. find $r_1$ and $r_2$, (ii) from (i) decide the appropriate test of significance to use, (iii) apply the test. The power of the test as laid down by (i), (ii) and (iii), in the usual meaning of the word, will be given by the overall power function.

It is proposed to discuss these, and other applications of the conditional power function technique, in a further publication. I have been concerned here with trying to explain what I believe to be the basic ideas, and to forestall possible criticism that I am falling into error (of the third kind) and am choosing the test falsely to suit the significance of the sample.


REFERENCES 

Stevens, W. L. (1939). Ann. Eugen., Lond., 9, 11.

Wald, A. & Wolfowitz, J. (1940). Ann. Math. Statist. 11, 147.

David, F. N. (1947). Biometrika, 34, 299.





A NUMERICAL SOLUTION OF THE PROBLEM OF MOMENTS 
By H. O. HARTLEY and S. H. KHAMIS 
1. Introduction 

Given a statistical variable $x$ and its frequency distribution $f(x)$, then, under certain continuity conditions for $f(x)$, the moments

$$\mu_r = \int x^r f(x)\,dx \quad (r = 0, 1, 2, \ldots) \qquad (1)$$

can be evaluated for any integer $r$. For certain distributions $f(x)$ the integrations in (1) can be carried out analytically, resulting in simple formulae for the moments. In general there is no inherent difficulty in obtaining numerical values for the moments by numerical quadrature.

The inverse problem is to find the distribution $f(x)$ given the moments $\mu_r$. This problem, commonly known as 'The Problem of Moments', has received considerable attention by mathematicians and is of interest in statistical distribution theory. There are numerous statistics for which it is difficult to obtain a formula of the random sampling distribution $f(x)$ amenable to numerical evaluation. On the other hand, in such cases it is often possible to find simple formulae for the random sampling moments (Bartlett, 1937). Sometimes such formulae are available for all integer $r$; more often than not, however, $\mu_r$ is only known for a limited number of small $r$ (e.g. $r = 0, 1, \ldots, 6$). A simple method of 'determining' $f(x)$ from the given moments would therefore be helpful in such cases.

Examples of variables of this kind are the numerous moment statistics or $k$-statistics for which random sampling moments can be evaluated, notably by R. A. Fisher's (1929, 1930) combinatorial methods, whilst their exact sampling distributions are usually unknown. As related statistics we should mention here the moment ratios $\sqrt{b_1}$ and $b_2$ used in tests for deviation from normality (Geary, 1947; Geary & Worlledge, 1947). For these, the low-order moments are known exactly. A similar situation arises with statistics defined as likelihood ratios, as, for instance, with the criterion $L_1$ required for testing heterogeneity in a set of variances. Moments for this statistic were obtained by Neyman & Pearson as early as 1931, yet, although approximations to $f(L_1)$ have been obtained (Bartlett, 1937; Hartley, 1940; Nayer, 1936; Neyman & Pearson, 1931; Sukhatme, 1936; Welch, 1935, 1936), there is still considerable doubt about their accuracy in certain cases, and the exact formula obtained by Nair (1936) in the case of equal sample sizes is very complex.

These and numerous other problems of distribution point to the necessity of developing 
a numerical technique to deal with the following situation: 

(i) A random variable $x$ ranging between $a$ and $b$ (where $a$ may be $-\infty$ and $b$ may be $+\infty$) has a distribution function $f(x)$ known to have a continuous derivative of order $n$.

(ii) The moments

$$\mu_r = \int_a^b x^r f(x)\,dx \quad (r = 0, 1, \ldots, R), \qquad (2)$$

are known numerically to any decimal accuracy desired but for a limited number of positive integers $r$, viz. $r = 0, 1, \ldots, R$.

With the knowledge about $f(x)$ limited to the above conditions, is it possible to obtain numerical values for the probability integral $P(x) = \int_a^x f(x)\,dx$ depending on the moments only, and is it possible to make a statement on the accuracy of these values in terms of the derivatives of the function $f(x)$?

Problems of this kind have hitherto been treated principally in two ways: 

(a) When R = 2, 3 or 4 nothing better can be expected than a ‘good fit’, which is often 
achieved by fitting the appropriate Pearson-type curve. 

(b) With $R$ in the neighbourhood of 5-8, expansions of the Gram-Charlier, Laguerre or Jacobi type have been used, either as cumulant or as moment expansions. Such theorems as are available for statements on the convergence and asymptotic behaviour of these expansions usually require too many moments to be known. Often the expansions are only asymptotic, and unless the distribution is close to the generating curve (Normal for Gram-Charlier, $\Gamma$ for Laguerre), the results are often disappointing (see, for example, Kendall, 1945, Chapter 6).


2. Outline of present method 

The method to be developed here is a direct application of finite-difference calculus and therefore provides both numerical answers to the problem and gauges of their accuracy in the form of remainder terms. The method is, in fact, closely linked with interpolation technique. When using any of the well-known interpolation formulae no mathematically rigorous statement on the accuracy of the interpolates can be made unless the magnitude of the remainder term can be estimated, and for this some knowledge about (say) the $n$th derivative of the function is required. Yet, in using such formulae the convergence of the difference table inspires confidence that 'the results of the interpolation can be accepted as a working hypothesis' (Milne-Thomson, 1933, p. 62). Similarly, with the present method we shall give a numerical procedure for obtaining values of the probability integral. Certain checks of internal consistency will be described which inspire confidence that the answers are correct, but no rigorous statement on the accuracy can, of course, be made if this is to be based on a finite number of moments alone. The exact remainder terms which we derive will entail the high-order derivatives of $f(x)$, and it is hoped, in a second communication, to derive some general statements concerning their order of magnitude.

In order to simplify the argument we assume in this section that the range of x is finite 
(a and b finite). 

The aim is to determine the probability integral of $x$, $P(x) = \int_a^x f(x)\,dx$, in tabular form, i.e. we wish to determine numerical values of

$$P_i = P(x_i) = \int_a^{x_i} f(x)\,dx \qquad (3)$$

for discrete values of $x_i$. For convenience the group intervals $x_{i+1} - x_i$ will generally be chosen equidistant (group interval $= h$), and the number of intervals will be $R+1$, i.e. equal to the number of given moments (including $\mu_0 = 1$). Hence

$$x_i = a + ih, \quad h = (b-a)/(R+1). \qquad (4)$$

The first differences in the table derived from equation (3) are the quantities

$$f_i = P_i - P_{i-1} = \int_{x_{i-1}}^{x_i} f(x)\,dx, \qquad (5)$$

and are the familiar 'frequencies' $f_i$ in a grouped frequency distribution with equidistant intervals (see Fig. 1). The link between these frequencies and the exact moments $\mu_r$ is then established by the well-known formulae for Sheppard's correction. Using Kendall's (1938, 1945) derivation and remainder term, but extending his notation, we have

$$\sum_{i=1}^{R+1} f_i \xi_i^r = \mu_r + C(r,h) + S(r,h), \qquad (6)$$

where the centre points $\xi_i$ are given by

$$\xi_i = a + (i - \tfrac12)h \quad (i = 1, \ldots, R+1), \qquad (7)$$

$C(r,h)$ denotes Sheppard's corrective term, viz.

$$C(r,h) = \sum_{j=1}^{[r/2]} \binom{r}{2j} \left(\frac{h}{2}\right)^{2j} \frac{1}{2j+1}\, \mu_{r-2j}, \qquad (8)$$

and $S(r,h)$ the remainder term.
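As a worked instance, and assuming the corrective term has the usual form for grouped raw moments, $C(r,h) = \sum_{j \ge 1} \binom{r}{2j} (h/2)^{2j} \mu_{r-2j}/(2j+1)$ (the summation index $j$ is introduced here only for illustration), the first few terms are:

```latex
\begin{aligned}
C(0,h) &= C(1,h) = 0,\\
C(2,h) &= \tfrac{1}{12}h^{2}\mu_0,\\
C(3,h) &= \tfrac{1}{4}h^{2}\mu_1,\\
C(4,h) &= \tfrac{1}{2}h^{2}\mu_2 + \tfrac{1}{80}h^{4}\mu_0 .
\end{aligned}
```

With $\mu_0 = 1$ the second line is the familiar Sheppard adjustment of $h^2/12$ to the second moment.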



The aim, now, is to use equations (6) to determine the unknown $f_i$ from the given $\mu_r$. To this end the remainder $S(r,h)$ must be examined. Most distribution functions have what is commonly known as high contact at the terminals of the variate range. This means that $f(x)$, as well as all its derivatives up to order, say, $m$, vanish at both ends of the range, i.e.

$$f^{(i)}(a) = f^{(i)}(b) = 0 \quad (i = 0, 1, \ldots, m). \qquad (9)$$

If for such functions we define $f(x) = 0$ outside the range $a \le x \le b$, it will have continuous derivatives of up to order $m$ for $-\infty < x < +\infty$. It can then be shown that the remainder term is of the form (see, for example, Kendall, 1945, p. 69)

BJr,h) = - ^ R+ Jc h B m K m \r,h,6 r ) (m even), (10) 

S m (r, h) = J3»+i(i) * m \r, h, 0 r ) (m odd), (11) 

a<6 r ^b, 

where the $B_i$ are the Bernoulli numbers, the $B_i(x)$ are the Bernoulli polynomials of first order, the integrand function $k(r,h,x)$ is defined by

$$k(r,h,x) = x^r \int_{-\frac12 h}^{\frac12 h} f(x+\xi)\,d\xi, \qquad (12)$$

and its derivatives with regard to $x$ are denoted by $k^{(j)}$. In the subsequent sections we shall assume (9) to hold (contact of order $m$), but will discuss the case when (9) is not satisfied in § 10.

The remainder term $S_m(r,h)$ will usually be small (see, for example, Kendall, 1945, p. 72). We shall therefore, in what follows, ignore $S_m(r,h)$ but will discuss the error thereby committed in § 5.

If, then, in (6) we omit $S(r,h)$ we obtain a system of $R+1$ linear equations for the $R+1$ unknowns $f_i$:

$$\sum_{i=1}^{R+1} f_i \xi_i^r = \mu_r + C(r,h) = \bar\mu_r. \qquad (13)$$

The matrix of this system of equations ($v_R$, say) is of the form $|\xi_i^r|$ and has a classical determinant $\|\xi_i^r\|$, sometimes referred to as Vandermonde's determinant and well known to be $\neq 0$. The system can therefore be inverted once and for all and, for any particular case, the unknown $f_i$ can then be determined by substituting the right-hand sides of (13), i.e. $\bar\mu_r$, in the inverse matrix $v_R^{-1}$. Denoting the elements of this inverse matrix by $u_{ir}$ we have the system of equations

$$f_i = \sum_{r=0}^{R} u_{ir}\bar\mu_r. \qquad (14)$$

Progressive addition of the $f_i$ yields the $P_j$ from $P_j = \sum_{i=1}^{j} f_i$,* and therefore a table of $P(x)$ at interval $h$. Finally, intermediate values of $P(x)$ can be obtained by standard interpolation. Alternatively, as described in § 7, we may obtain directly a table of $P(x)$ at interval $\tfrac12 h$.

3. The standard form of the numerical inversion 

The rank of the original matrix $v_R$ is obviously equal to $R+1$, i.e. the number of moments given, whilst its elements are the powers of the centre points $\xi_i$. It is desirable therefore that, for any given $R$, scale and location of the variable $x$ be transformed into a standard form $X$, so that only one matrix $V_R$ and therefore only one matrix $V_R^{-1}$ need be calculated for each $R$. It is most convenient to standardize as follows:


$$X = \left(x - \tfrac12(a+b)\right)\frac{R+1}{b-a} \quad (R \text{ even}), \qquad (15)$$

$$X = \left(x - \tfrac12(a+b)\right)\frac{R+1}{b-a} + \tfrac12 \quad (R \text{ odd}). \qquad (16)$$

It will be seen, therefore, that the range of $X$ is $R+1$ and the group interval $H = X_{i+1} - X_i = 1$.

From the given moments of $x$ those of $X$ ($M_r$, say) about $X = 0$ can, of course, be calculated by the usual binomial formulae, and in what follows we assume that values of $M_r$ are given numerically. Further, in analogy to (13), we have

$$\bar M_r = M_r + C(r, 1). \qquad (17)$$

From (15) and (16) we obtain for the new centre points

$$\Xi_i = -\tfrac12 R, \ldots, 0, \ldots, +\tfrac12 R \ \text{ for even } R, \qquad \Xi_i = -\frac{R-1}{2}, \ldots, 0, \ldots, \frac{R+1}{2} \ \text{ for odd } R, \qquad (18)$$

and the matrix $V_R$ becomes $|(i - 1 - \tfrac12 R)^r|$ or $|(i - \tfrac12 R - \tfrac12)^r|$. Thus, if the first six moments are given, we obtain for $V_6$:

$$\begin{vmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 \\
-3 & -2 & -1 & 0 & 1 & 2 & 3 \\
9 & 4 & 1 & 0 & 1 & 4 & 9 \\
-27 & -8 & -1 & 0 & 1 & 8 & 27 \\
81 & 16 & 1 & 0 & 1 & 16 & 81 \\
-243 & -32 & -1 & 0 & 1 & 32 & 243 \\
729 & 64 & 1 & 0 & 1 & 64 & 729
\end{vmatrix} \qquad (19)$$


* It is, of course, possible to construct a matrix yielding the $P_j$ directly from the $\bar\mu_r$, but we are here satisfied with determining the $f_i$ first, as they are of independent interest.


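In practice the important range of $R$ will be from 5 to about 8. The inverse matrix $V_6^{-1}$ is given below, and it is hoped to give $V_5^{-1}$, $V_7^{-1}$ and $V_8^{-1}$ in a subsequent paper. The inverse matrix $V_R^{-1}$, the elements of which are denoted by $U_{ir}$, can be written in the form

$$c_i f_i = \sum_{r=0}^{R} U'_{ir}\bar M_r, \qquad (20)$$

where $U'_{ir} = c_i U_{ir}$, i.e. the $c_i$ are suitable common denominators of the $U_{i0}, \ldots, U_{iR}$, and the $U'_{ir}$ are given in the body of the schedule below:

$$\begin{array}{c|c|rrrrrrr}
 & \text{multiplier of column:} & \bar M_0 = 1 & \bar M_1 & \bar M_2 & \bar M_3 & \bar M_4 & \bar M_5 & \bar M_6 \\
i & c_i f_i & r = 0 & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
1 & 720 f_1 & 0 & -12 & 4 & 15 & -5 & -3 & 1 \\
2 & 120 f_2 & 0 & 18 & -9 & -20 & 10 & 2 & -1 \\
3 & 48 f_3 & 0 & -36 & 36 & 13 & -13 & -1 & 1 \\
4 & 36 f_4 & 36 & 0 & -49 & 0 & 14 & 0 & -1 \\
5 & 48 f_5 & 0 & 36 & 36 & -13 & -13 & 1 & 1 \\
6 & 120 f_6 & 0 & -18 & -9 & 20 & 10 & -2 & -1 \\
7 & 720 f_7 & 0 & 12 & 4 & -15 & -5 & 3 & 1
\end{array} \qquad (21)$$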


In order to use the above system of equations it would be necessary to compute the $\bar M_r$ from the given $M_r$, using formula (17). It is obviously more convenient to evaluate, once and for all, a matrix $U''_{ir}$ giving the $f_i$ directly in terms of the given $M_r$. This matrix is given below for $R = 6$:

$$\begin{array}{c|c|rrrrrrr}
i & f_i & M_0 = 1 & M_1 & M_2 & M_3 & M_4 & M_5 & M_6 \\ \hline
1 & f_1 & 0.000379 & -0.011719 & 0.002344 & 0.017361 & -0.005208 & -0.004167 & 0.001389 \\
2 & f_2 & -0.005227 & 0.109375 & -0.034896 & -0.152778 & 0.072917 & 0.016667 & -0.008333 \\
3 & f_3 & 0.059161 & -0.683594 & 0.618490 & 0.253472 & -0.244792 & -0.020833 & 0.020833 \\
4 & f_4 & 0.891373 & 0 & -1.171875 & 0 & 0.354167 & 0 & -0.027778 \\
5 & f_5 & 0.059161 & 0.683594 & 0.618490 & -0.253472 & -0.244792 & 0.020833 & 0.020833 \\
6 & f_6 & -0.005227 & -0.109375 & -0.034896 & 0.152778 & 0.072917 & -0.016667 & -0.008333 \\
7 & f_7 & 0.000379 & 0.011719 & 0.002344 & -0.017361 & -0.005208 & 0.004167 & 0.001389
\end{array} \qquad (22)$$

Working rule: Each $f_i$ is obtained by forming the sum of seven products using the seven coefficients in the $i$th line and applying them to $M_0, \ldots, M_6$; e.g. $f_1 = 0.000379 M_0 - 0.011719 M_1 + \ldots + 0.001389 M_6$.
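The decimal matrix (22) can be regenerated by folding Sheppard's corrective terms into the inverse Vandermonde matrix. The sketch below assumes the corrective term has the form $\bar M_r = \sum_j \binom{r}{2j} (h/2)^{2j} M_{r-2j}/(2j+1)$ with $h = 1$; the helper names are hypothetical and the code is an illustration rather than the authors' computing scheme.

```python
from fractions import Fraction
from math import comb


def lagrange_rows(nodes):
    """Inverse Vandermonde matrix: row i = Lagrange basis coefficients (ascending)."""
    rows = []
    for i, xi in enumerate(nodes):
        coeffs, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(nodes):
            if j == i:
                continue
            shifted = [Fraction(0)] + coeffs
            scaled = [xj * c for c in coeffs] + [Fraction(0)]
            coeffs = [s - t for s, t in zip(shifted, scaled)]
            denom *= xi - xj
        rows.append([c / denom for c in coeffs])
    return rows


def sheppard_weights(r, h=Fraction(1)):
    """Weights w[s] with Mbar_r = sum_s w[s] * M_s (assumed form of the correction)."""
    w = [Fraction(0)] * (r + 1)
    for j in range(r // 2 + 1):
        w[r - 2 * j] = comb(r, 2 * j) * (h / 2) ** (2 * j) / (2 * j + 1)
    return w


R = 6
U = lagrange_rows([Fraction(k) for k in range(-3, 4)])
# U2[i][s] multiplies the uncorrected moment M_s in the expression for f_{i+1}.
U2 = [[sum(U[i][r] * sheppard_weights(r)[s] for r in range(s, R + 1))
       for s in range(R + 1)] for i in range(R + 1)]
for row in U2:
    print([round(float(v), 6) for v in row])
```

A useful by-product is a check on any printed copy of the matrix: the $M_0$ column must sum to unity and every other column to zero, since the $f_i$ always add up to $M_0$.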

4. Calculation of the incomplete B-function $I_x(8, 6)$ from its first six moments

As an example for the above method we consider the Beta distribution for $p = 8$ and $q = 6$, viz. $f(x) = [B(8, 6)]^{-1} x^7 (1-x)^5$.

Using the moments for this distribution about $x = 0$, $\mu_r = B(8+r, 6)/B(8, 6)$ $(r = 0, \ldots, 6)$, and transforming to the standard scale $X = 7x - 3.5$, we obtain for the moments of $X$ about $X = 0$: $M_1 = 0.5$, $M_2 = 1.05$, $M_3 = 1.225$, $M_4 = 2.77426$, $M_5 = 4.41360$ and $M_6 = 10.56942$.

Substituting these in the matrix (22) we obtain values of $f_i$ whose progressive sums are shown in Table 1 (calculated $I_x(8, 6)$). These may be compared with the 'exact' values obtained (by interpolation) from the Tables of the Incomplete B-function (1934). The worst discrepancy is about 2 in the fourth decimal. Higher accuracy can, of course, be obtained if the number of moments ($R+1$) and therefore the number of $f_i$ increases (see, for example, § 8, where the normal curve is obtained to 5-decimal accuracy).
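The Beta example can be retraced in exact arithmetic. The sketch below is an illustration under stated assumptions: Sheppard's correction is taken in the form $\bar M_r = \sum_j \binom{r}{2j} (1/2)^{2j} M_{r-2j}/(2j+1)$, and the 'exact' values come from the binomial-sum identity for the incomplete Beta function with integer parameters rather than from the 1934 Tables. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def lagrange_rows(nodes):
    """Inverse Vandermonde matrix: row i = Lagrange basis coefficients (ascending)."""
    rows = []
    for i, xi in enumerate(nodes):
        coeffs, denom = [Fraction(1)], Fraction(1)
        for j, xj in enumerate(nodes):
            if j == i:
                continue
            shifted = [Fraction(0)] + coeffs
            scaled = [xj * c for c in coeffs] + [Fraction(0)]
            coeffs = [s - t for s, t in zip(shifted, scaled)]
            denom *= xi - xj
        rows.append([c / denom for c in coeffs])
    return rows


def sheppard_corrected(M, h=Fraction(1)):
    """Mbar_r = M_r + C(r, h), with the assumed form of Sheppard's corrective term."""
    return [sum(comb(r, 2 * j) * (h / 2) ** (2 * j) / (2 * j + 1) * M[r - 2 * j]
                for j in range(r // 2 + 1)) for r in range(len(M))]


p, q, R = 8, 6, 6
# Moments of x about 0: mu_r = B(p + r, q) / B(p, q), built up as a running product.
mu = [Fraction(1)]
for r in range(R):
    mu.append(mu[-1] * Fraction(p + r, p + q + r))
# Moments of X = 7x - 3.5 about X = 0, by the binomial formula.
scale, shift = Fraction(7), Fraction(-7, 2)
M = [sum(comb(r, s) * scale ** s * shift ** (r - s) * mu[s] for s in range(r + 1))
     for r in range(R + 1)]

U = lagrange_rows([Fraction(k) for k in range(-3, 4)])
Mbar = sheppard_corrected(M)
f = [sum(u * m for u, m in zip(row, Mbar)) for row in U]


def exact_I(x, a=p, b=q):
    """I_x(a, b) for integer a, b as an upper binomial tail."""
    n = a + b - 1
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(a, n + 1))


calc, running = [], Fraction(0)
for i, fi in enumerate(f, start=1):
    running += fi
    calc.append((Fraction(i, 7), running))       # P at x = i/7
worst = max(abs(P - exact_I(x)) for x, P in calc)
print([round(float(m), 5) for m in M])
print(float(worst))
```

Recomputing the moments in this way also provides a check on the printed values of $M_1, \ldots, M_6$.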



A rather gratifying feature of the comparison is the higher decimal accuracy in the tails of the distribution. This is a consequence of the sensitivity of the higher moments to changes in the tail frequencies. Note also that the elements in the top and bottom lines of the inverse matrix (22) are much smaller than those in the other lines, so that any error in the right-hand sides of (13) has a smaller effect on the terminal $f_i$.

Table 1. Comparison of 'calculated' and 'exact' values of $I_x(8, 6)$

It might be argued that a further error will arise when determining intermediate values of $I_x$ by interpolation in the 'calculated' table. This difficulty could, however, be overcome by shifting the grid of group intervals and using a standard $X$-scale with group end-points corresponding to the odd multiples of 1/14 in $x$, thereby obtaining $I_x$ points half-way between the arguments of Table 1. Such a method has actually been used in § 7.


5. The remainder term 

A formal representation of the remainder term is immediately obtained by reverting to the exact equations (6). If we are concerned with distribution functions having contact of order $m$ at the terminals, the error contributions to the $f_i$ are obtained by substituting the $R+1$ remainder terms $S_m(r,h)$ ((10), (11)) in the inverse matrix $v_R^{-1}$. It is convenient to use the standard variate $X$-scale, $H = 1$ and the $V_R^{-1}$ matrix, when it will be found that

$$\text{error } f_i = \sum_{r=0}^{R} U_{ir} S_m(r, 1), \qquad (23)$$

where $S_m(r, 1)$ is given by (10) or (11) putting $h = 1$ and remembering that the integrand function $k$ must be taken in terms of the standard variate $X$, viz.

$$k(r, 1, X) = X^r \int_{-\frac12}^{\frac12} f(X + \xi)\,d\xi. \qquad (24)$$

Since the arguments $\theta_r$ of $k^{(m)}(r, 1, X)$ are unknown it will as a rule be necessary to substitute their respective maxima in (23), at the same time taking $|U_{ir}|$ in place of $U_{ir}$.

Although with (23) we have given a formal solution of the error term involved, in a manner similar to the remainder terms of interpolation formulae, it will in practice be difficult to estimate the magnitude of the error from this formula. It is hoped, therefore, to go into this aspect more fully in a second paper.


6. Infinite variate range and artificial truncation 

When the range of the variate is infinite, i.e. when $a = -\infty$ and/or $b = +\infty$, it is, of course, possible to transform the variate $x$ by, say, $y = y(x)$ such that the range of $y$ is finite. However, in general, we shall not be able to assume that the moments of $y$ are known or that they can be derived from those of $x$. It is therefore necessary to adapt our method to deal with an infinite variate range. We shall treat here the case $b = +\infty$, the case $a = -\infty$ being identical and the case $a = -\infty$ and $b = +\infty$ being analogous.

For an infinite variate range, the condition of high contact is now replaced by

$$\lim_{x \to \infty} f^{(i)}(x) = 0 \quad (i = 0, 1, \ldots, m), \qquad (25)$$

which results in remainder terms analogous to (10) and (11).* Similarly, in equations (6), which correspond to Kendall's (1945) equations (3.40), the summation now extends from $i = 1$ to $i = \infty$, there being an infinity of frequencies $f_i = \int_{x_{i-1}}^{x_i} f(x)\,dx$. Now since the $\mu_r$ exist we know that

$$\int_a^{\infty} x^r f(x)\,dx \qquad (26)$$

is convergent. Accordingly

$$\lim_{b \to \infty} \sum_{i=R+2}^{\infty} (i - \tfrac12)^r h^r \int_{(i-1)h}^{ih} f(x)\,dx = 0, \qquad (27)$$

if $h = (b-a)/(R+1)$. If, therefore, we denote the above sums by $\epsilon(r, b)$ respectively we have, from (6),

$$\sum_{i=1}^{R+1} f_i \xi_i^r + \epsilon(r, b) = \mu_r + C(r,h) + S(r,h). \qquad (28)$$

Applying now the previous method we introduce an additional error in the calculation of $f_i$, but this error is smaller than $\pm \max_r |\epsilon(r, b)| \sum_r |u_{ir}|$.


The precise determination of the $\epsilon(r, b)$ for any given $b$ would, of course, require a knowledge of the nature of the convergence in (26), i.e. some external knowledge about the distribution $f(x)$ which we are seeking to determine numerically. Unfortunately, such knowledge will in general not be available.

However, if $b$ is chosen sufficiently large, the $f_i$ determined for different values of $b$ should all yield, by the method of §§ 2 and 3, approximations to the same probability integral $P(x)$ to within the errors of the respective remainder terms $S(r,h)$ and to within the errors introduced by (27). In practice, therefore, one would make an intelligent guess at the likely range of $b$ and then test for internal consistency by comparing the probability integral tables obtained by varying $b$ over this range. This method, which is illustrated in § 7, gives an idea of the accuracy to which the integral has been determined, but no rigorous statement on accuracy can be made without appealing to some a priori knowledge about $f(x)$. It is hoped to deal with this aspect more fully in the next paper.


7. The calculation of the $\chi$-distribution for 10 degrees of freedom

As an illustration of the preceding section, we will now calculate the $\chi$-distribution for 10 degrees of freedom. This distribution has high contact at either terminal and, although it is known to start at $x = \chi = 0$, we shall treat it as a distribution of doubly infinite range, i.e. we shall not make direct use of the information that $f(x) = 0$ for $x < 0$, and choose a truncated range $a \le x \le b$.

We have a mean of $\mu_1 = 3.08432770$, and the moments about the mean are given by†

$$\mu'_2 = 0.4869223, \quad \mu'_3 = 0.0806720, \quad \mu'_4 = 0.7132999, \quad \mu'_5 = 0.3866784, \quad \mu'_6 = 1.8104865.$$

* A formula for $S(r,h)$ when the range is infinite will be given in the second paper.

† These follow from the formulae for the moments about the origin, which are ratios of $\Gamma$-functions (see, for example, Kendall, 1945, p. 55). Note that we have used $\mu$ and $\mu'$ for moments about the origin and the mean, respectively.

The standard deviation is $\sqrt{\mu'_2} \approx 0.7$, and with seven group intervals available to cover the essential range we should choose $h$ of the order of the standard deviation.* Our first attempt is, therefore, (a) $h = 0.8$.

(a) If we make the mean of $x$ the centre point of the innermost interval we have for the truncated range $a = \mu_1 - 3.5 \times 0.8 = \mu_1 - 2.8$ and $b = \mu_1 + 2.8$. For the standard variate $X$, the origin $X = 0$ will coincide with the mean of $x$ and its range will be $-3.5 \le X \le +3.5$. Calculation of the moments ($M_r$) of $X$ and substitution in the matrix (22) yields the following answers for the frequencies $f_i$:

$$f_1 = 0.0005, \quad f_2 = 0.03325, \quad f_3 = 0.26260, \quad f_4 = 0.42471, \quad f_5 = 0.23190, \quad f_6 = 0.04200, \quad f_7 = 0.00487.$$

The calculated frequency ($f_7$) for the interval $\mu_1 + 2.0 \le x \le \mu_1 + 2.8$ is about 0.005, and its contribution to $\mu'_6$ about $0.005 \times (2.4)^6 \approx 1$. Since this is an appreciable proportion of $\mu'_6$ it is unlikely that the frequencies beyond $b = \mu_1 + 2.8$ when substituted in (27) can be neglected, i.e. $b$ and $h$ are too small.†

(b) Choosing therefore a larger $h$, we try $h = 1$. If we still keep the mean in the centre of the truncated range we have $a = \mu_1 - 3.5 = -0.42$ and $b = \mu_1 + 3.5 = 6.58$ (we know, of course, that $f(x) = 0$ for $x < 0$ so that our $f_1$ will really be the frequency for the interval $0 \le x \le 0.58$). This time the standard variate is $X = x - \mu_1$ so that $M_r = \mu'_r$, and the above values can be substituted directly in the matrix (22), yielding the comparison of calculated $\chi$-integral and 'exact' $\chi$-integral as shown in Table 2.


Table 2. Comparison of calculated and exact values of the $\chi$-integral

X = x - μ_1    P(x) exact    P(x) calculated    Difference × 10^-5
   -2.5         0.00006         0.00011               -5
   -1.5         0.00929         0.00893               36
   -0.5         0.24466         0.24475               -9
   +0.5         0.76767         0.76785              -18
   +1.5         0.97902         0.97888               14
   +2.5         0.99945         0.99947               -2
   +3.5         1.00000         1.00000                0


The maximum error is about 0.0004 and, again, the terminal $f_i$ have a higher decimal accuracy. In practice, of course, the exact distribution would not be available for comparison. This time the terminal value $f_7$ is about 0.0005 and represents the frequency for the interval $\mu_1 + 2.5 \le x \le \mu_1 + 3.5$. Its contribution to $\mu'_6$ is about 0.4, thereby confirming that the previous grid of group intervals was too fine. To obtain further confirmation on the tail of the distribution, we determine a third set of $f_i$ by shifting the grid of group intervals by 0.5 to the right, retaining the interval $h = 1$. This will make $a = \mu_1 - 3$ and $b = \mu_1 + 4$, i.e. $0.08 \le x \le 7.08$. For our standard variate $X$ the origin will now coincide with $\mu_1 + 0.5$.

* An unsuitable choice of $h$ would, later, fail to satisfy the checks of internal consistency.

† Comparison with the exact $\chi$-distribution shows that the maximum error in the above $f_i$ is nevertheless not more than 0.005.

The values of the $M_r$ are as follows:

$$M_1 = -0.5, \quad M_2 = 0.7369223, \quad M_3 = -0.7747114, \quad M_4 = +1.3448394, \quad M_5 = -1.834794, \quad M_6 = 3.5957606.$$

Substituting these in the matrix (22) we obtain the following values of $f_i$:

$$f_1 = 0.00016, \quad f_2 = 0.07012, \quad f_3 = 0.44430, \quad f_4 = 0.40421, \quad f_5 = 0.07699, \quad f_6 = 0.00419, \quad f_7 = 0.00004.$$

The comparison of the progressive sums of the above $f_i$ with the exact $\chi$-integral is of similar accuracy to that in Table 2. The terminal frequency for $\mu_1 + 3 \le x \le \mu_1 + 4$ is 0.00004 with a contribution of about 0.03 to $\mu'_6$, indicating that we have now reached a satisfactory choice of $b$.

As a final check on the internal consistency we compare the answers obtained with the two last choices of group intervals by merging the tables of $P(x)$ to obtain one table at interval 0.5. This is set out in Table 3. The differences provide a fair check on the internal consistency to about 3-decimal accuracy of the two separate tables. If a more reliable check is desired, three or even four separate tables may be computed, all at the same group interval $h$, and merged in the above manner to form a single table at interval $\tfrac13 h$ or $\tfrac14 h$. This procedure has the added advantage that interpolation difficulties at the wide interval of $h$ are being avoided.

Table 3. $x = \chi$ for 10 degrees of freedom. Calculated table of $P(x)$ obtained from two separate grids of group intervals ($h = 1$)

(Forward differences are given in units of the fourth decimal.)

X = x - μ_1    P(x)      1st diff.   2nd diff.   3rd diff.
   -2.5       0.0001          1          86         442
   -2.0       0.0002         87         528         601
   -1.5       0.0089        615        1129        -174
   -1.0       0.0704       1744         955       -1122
   -0.5       0.2448       2699        -167        -855
    0         0.5147       2532       -1022         112
    0.5       0.7679       1510        -910         480
    1.0       0.9189        600        -430         296
    1.5       0.9789        170        -134         103
    2.0       0.9959         36         -31
    2.5       0.9995          5
    3.0       1.0000


8. The special case of symmetrical distributions; the normal integral 

By placing the origin of the standard variate $X$ at the mean of a symmetrical distribution we obviously have $f_1 = f_{R+1}$, $f_2 = f_R$, etc., i.e. the number of unknowns is halved. On the other hand, the odd moments contribute the meaningless equations

$$\sum_i f_i (\Xi_i^r + \Xi_{R+2-i}^r) = \sum_i f_i \times 0 = 0.$$

With the number of unknowns and equations halved and with even moments only retained, it is necessary to work out a new matrix ($V_R$ say) based on even-order moments only. In practice the important values of $R$ are $R = 4$, 6, 8 and 10, and we are giving below the inverse matrix $V_8^{-1}$ (for $R = 8$) having rank 5 (as there are five equations corresponding to $\mu_0$, $\mu_2$, $\mu_4$, $\mu_6$ and $\mu_8$):


$$\begin{array}{c|c|rrrrr}
i & f_i & M_0 = 1 & M_2 & M_4 & M_6 & M_8 \\ \hline
1 & f_1 & 0.0003441 & -0.0017867 & 0.0012153 & -0.0002316 & 0.0000124 \\
2 & f_2 & -0.0039874 & 0.0208333 & -0.0137153 & 0.0023148 & -0.0000868 \\
3 & f_3 & 0.0224151 & -0.1190476 & 0.0711806 & -0.0081019 & 0.0002480 \\
4 & f_4 & -0.0884281 & 0.5000000 & -0.1434028 & 0.0127315 & -0.0003472 \\
5 & f_5 & 0.5696563 & -0.4000000 & 0.0847222 & -0.0067130 & 0.0001736
\end{array} \qquad (29)$$

Working rule: Each $f_i$ is obtained by forming the sum of five products using the five coefficients in the $i$th line and applying them to $M_0, \ldots, M_8$; e.g. $f_1 = 0.0003441 M_0 - \ldots + 0.0000124 M_8$.

As an example we compute the normal integral from its first five even moments, $\mu_0$ to $\mu_8$, choosing $h = 1$ and the standard variate $X$ as normal deviate. Substituting, therefore, in the matrix (29) $M_0 = 1$, $M_2 = 1$, $M_4 = 3$, $M_6 = 15$ and $M_8 = 105$, we obtain the five $f_i$ which in Table 4 have been progressively added to form the 'calculated normal integral' to be compared with the 'exact' one. The accuracy is remarkable, the maximum error being 15 in the 6th decimal.


Table 4. Comparison of calculated normal integral with exact normal integral

  X     Exact P(x)    Calculated P(x)    Difference × 10^-6
 -4      0.000032        0.000034               -2
 -3      0.001350        0.001342                8
 -2      0.022750        0.022765              -15
 -1      0.158655        0.158643               12
  0      0.500000        0.500000                0

With symmetrical distributions we cannot, of course, shift the grid of group intervals, as otherwise we would lose the symmetry relation between the $f_i$. If, therefore, intermediate values of $P(x)$ are required in order to ease subsequent interpolation, we can achieve this only by altering $h$. Merging the answers obtained from (say) three different $h$ grids all centred at $x = 0$ (e.g. $h = 0.9$, 1.0 and 1.1), we would not obtain a table of $P(x)$ at an equidistant interval. In the internal check we would, therefore, use divided differences.


9. Divergent or poorly convergent moments; the 

£ -DISTRIBUTION FOR 10 DEGREES OF FREEDOM 

Some variates with infinite range have distribution functions with low contact at x = oo, 
i.e. the convergence in lim f {x) = 0 (30) 

3*->00 

is slow, indeed, in some cases the moment p T is divergent for, say, r > R'. 

As an example we have investigated the ^-distribution for 10 degrees of freedom. Here we 
have/(x) = c(l + t 2 /10) -5 ' 6 and hence R' = 10. In this case, therefore, R' is known a priori. 
If no such mathematical information is available, warning of low contact is given by the rapid 



350 


A numerical solution of the problem of moments 

growth of the moments as r-*R, provided R is near to R'.* For our example for the 
^-distribution we find 

/t a = 1-25, fi K = 6-25, /i 6 = 78-125, /i 6 - 2734-375. 

The difficulty with suoh distributions is that artificial truncation is not justified if the 
high-order, poorly convergent moments are to be used in equations (6). The remedy in suoh 
cases is the square variate transformation y i — x. Sometimes it may be necessary to use a 
higher power y* = x. Obviously, if we were to take an equidistant interval for y, the group 
integral for x will grow with the square law, thereby absorbing the slowly convergent tail 
end of f(x). 

Now, obviously, the moments of y are simplj- related to those of x\ we have 

f X r f(x)dx= 2 1* y*yf(y 2 )dy, (31) 

Jo Jo 

or introducing the new distribution function g(y) = 2 y/(y a ), we have 

f ^/(xjda: = f y*g(y)dy. (32) 

Jo Jo 

Applying now the previous method to $g(y)$ it is further necessary to avoid using the poorly
convergent high-order moments. In the case of the t-distribution, instead of taking
r = 0, 2, 4, 6 and 8, we take the absolute moments† for r = 0, 1, 2, 3 and 4, which, according
to (32), correspond to the even moments of $g(y)$. If only even moments about the origin are
used in the determination of the $f_i$, the matrix (29) gives the appropriate inversion. Using
h = 0.6 for the y-group interval we substitute in (29):

$M_0 = 1$, $M_2 = 2.401906$, $M_4 = 9.645062$, $M_6 = 52.952032$ and $M_8 = 372.108863$.

We thereby obtain five values of $f_i$ (i = 1, ..., 5) of the form

$$f_i = \int_{(5-i)h}^{(6-i)h} g(y)\,dy = \int_{(5-i)^2h^2}^{(6-i)^2h^2} f(x)\,dx. \tag{33}$$

The progressive sums of these are compared with the corresponding values of the exact
t-integral in Table 5. Although the accuracy is lower than in the previous example it is
satisfactory and very much better than we could have obtained without applying the
transformation $y^2 = x$.
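The moments quoted above can be checked numerically. The following is a minimal sketch, assuming the standard closed form for the absolute moments of Student's t with 10 degrees of freedom; the function name `abs_moment_t` is introduced here for illustration only and is not part of the paper.

```python
from math import gamma, pi, sqrt

def abs_moment_t(p, nu):
    """E|t|^p for Student's t with nu degrees of freedom (requires p < nu)."""
    return nu ** (p / 2) * gamma((p + 1) / 2) * gamma((nu - p) / 2) / (sqrt(pi) * gamma(nu / 2))

nu, h = 10, 0.6  # degrees of freedom and y-group interval used in the text

# By (32), the even moment of g(y) of order 2r is the absolute moment of order r of f(x).
# Expressed in units of the group interval h this gives M_{2r} = E|t|^r / h^(2r).
M = {2 * r: abs_moment_t(r, nu) / h ** (2 * r) for r in range(5)}

for order, value in M.items():
    print(f"M_{order} = {value:.6f}")
```

The even-order moments of t itself (1.25, 6.25, 78.125, 2734.375) come from the same function with p = 2, 4, 6, 8.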


Table 5. Comparison of calculated and exact values of the t-integral

   t       P(t) exact    P(t) calculated    Difference × 10^-4
  5.76       0.0001          0.0001                 0
  3.24       0.0044          0.0042                 2
  1.44       0.0902          0.0905                -3
  0.36       0.3013          0.3048               -35
  0.00       0.5000          0.5000                 0


* If R is much smaller than R', the present difficulty will not arise at all.

† We shall show in a second paper that, if the absolute moments of a distribution are not known, they
can be obtained by interpolation between the values of $\log\mu_r$ for r = 2, 4, 6, 8, etc.; in fact, we shall give
a general discussion of the interpolability of the logarithmic moment function for positive x.






10. Lack of high contact at the start of the variate x = a

We confine ourselves here to the most important case of lack of high contact at one terminal,
say the start of the distribution, and assume, therefore, that there is high contact at the other end
of the range.

Without loss of generality we assume that a = 0, i.e. $x \ge 0$, and introduce the new variate
$y^k = x$, $k \ge 2$. Whence we have

$$\int_0^b x^r f(x)\,dx = \int_0^{b^{1/k}} y^{kr} g(y)\,dy, \tag{34}$$

where $g(y) = k y^{k-1} f(y^k)$. Obviously $g(y)$ has, at least, contact of order $k-1$ at the start y = 0;
further, if $f(x)$ has contact of order m at x = b, i.e. if $f(x) = O(x^{-m})$ at x = b, then
$g(y) = O(y^{-(m-1)k-1})$. Hence there is high-order contact, of order $k-1$ and $(m-1)k+1$,
respectively, at both ends of the range. The previous method is therefore applicable to $g(y)$
provided we can obtain its moments from those of $f(x)$. It is obvious from (34) that in order
to obtain the ordinary moments of $g(y)$ we require to know the 'fractional' moments of $f(x)$,
i.e. those corresponding to r = j/k (j = 0, 1, ...). If the moments of $f(x)$ are only known for
integer r the fractional moments will have to be obtained by interpolation of the logarithmic
moment function $\log\mu_r$, which will be more fully discussed in the next paper.

REFERENCES 

Bartlett, M. S. (1937). Proc. Roy. Soc. A, 160, 268.

Fisher, R. A. (1929). Proc. Lond. Math. Soc. (2), 30, 199.

Fisher, R. A. (1930). Proc. Roy. Soc. A, 130, 16.

Geary, R. C. (1947). Biometrika, 34, 68.

Geary, R. C. & Worlledge, J. P. G. (1947). Biometrika, 34, 98.

Hartley, H. O. (1940). Biometrika, 31, 249.

Kendall, M. G. (1938). J.R. statist. Soc. 101, 592.

Kendall, M. G. (1946). The Advanced Theory of Statistics, 1. London: C. Griffin and Co.

Milne Thomson, L. M. (1933). The Calculus of Finite Differences. London: Macmillan.

Nair, U. S. (1936). Biometrika, 30, 274.

Nayer, P. P. N. (1936). Statist. Res. Mem. 1, 38.

Neyman, J. & Pearson, E. S. (1931). Bull. int. Acad. Cracovie, A, p. 460.

Sukhatme, P. V. (1936). Statist. Res. Mem. 1, 94.

Tables of the Incomplete Beta-Function, edited by Karl Pearson (1934). London: Biometrika Office.

Welch, B. L. (1936). Biometrika, 27, 145.

Welch, B. L. (1936). Statist. Res. Mem. 1, 1.





APPROXIMATION TO PERCENTAGE POINTS OF
THE z-DISTRIBUTION


By A. H. CARTER, King's College, Cambridge


Tables have been published of the values of z for various percentage levels (20, 5, 1 and 0.1%)
for a range of given $n_1$, $n_2$ (Fisher & Yates, 1943, Table V). Where $n_1$ or $n_2$ is outside the range
of the tables, recourse must be had to approximate formulae (unless, of course, interpolation
is sufficiently accurate) which will combine accuracy with facility of computation. One
such formula, due to Fisher, with a modification suggested by Cochran (1940), is given at
the foot of the above-mentioned tables. The purpose of this paper is to derive an alternative
formula, no more difficult to compute, which will be shown to give consistently closer
approximations to the true value of z for all except small $n_1$ or $n_2$.

Wishart (1947) has derived formulae for the exact cumulants of z, and also the well-known
approximations to them when $n_1$, $n_2$ are large. The exact cumulants as far as $\kappa_6$ can be readily
obtained arithmetically from tables of the Polygamma functions. Knowing the cumulants
of the distribution, we may make use of the Cornish-Fisher normalization function method,
based on Edgeworth's form of the Gram-Charlier type A series (Cornish & Fisher, 1937), to
approximate to the percentage points. The method consists in writing z as an expansion
in powers of a corresponding normal variate, $\xi$, the coefficients being functions of the
cumulants of z, and assumes that $\kappa_r$ is of order $n^{1-r}$, which is true for the z-distribution
(Wishart, 1947, p. 172).

If z and $\xi$ are expressed in standard measure (i.e. mean zero, standard deviation unity)
we then derive

$$z' \sim \xi + \frac{\kappa_3}{\sigma^3}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^4}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^6}\,\frac{2\xi^3-5\xi}{36},$$

correct to order $n^{-1}$ for $z'$. This gives

Formula (a):
$$z \sim \mu_1' + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3-5\xi}{36}, \tag{1}$$

correct to order $n^{-3/2}$ (since $\sigma = O(n^{-1/2})$), where $\mu_1'\,(=\kappa_1)$, $\sigma^2\,(=\kappa_2)$, $\kappa_3$, $\kappa_4$ are cumulants of
the z-distribution. The $\xi$-coefficients may be readily computed: e.g. for the 5% level,
substitute $\xi = 1.64485$. Table 2 gives the values, for the 20, 5, 1 and 0.1% levels, of the coefficients
required in applying the formula. The quantities $\mu_1'$, $\sigma$, $\kappa_3$, $\kappa_4$ depend of course on $n_1$ and $n_2$,
and may be evaluated in any particular case, whence substitution in (1) gives the appropriate
value of z. Since $|z_{1-P}(n_1, n_2)| = |z_P(n_2, n_1)|$, where $z_P$ is the value of z corresponding to
probability P, to find the percentage points for the 'negative tail', i.e. 80, 95, 99 and 99.9%,
we may simply interchange $n_1$ and $n_2$. This has the effect of changing the sign of the odd
cumulants, so that in (1) we write $-\mu_1'$ and $-\kappa_3$ for $\mu_1'$ and $\kappa_3$.

Formula (a), being an approximation to order $n^{-3/2}$, may be expected to give reliable results
when $n_1$ and $n_2$ are both large. For the 1% point, for example, we find z(6, 12) = 0.7843
(true value 0.7864), whereas z(24, 60) = 0.3744 (true value 0.3746). Some further results for
(6, 12), (6, 60), and (24, 60) are shown in Table 1(a).

In practice, some labour is involved in applying formula (a), even if polygamma tables
are available. The Fisher-Cochran formula, derived by the normalization function method,
is a simple working approximation, valid for large $n_1$, $n_2$, in which the exact cumulants are
replaced by their approximations in terms of inverse powers of $n_1$ and $n_2$.





Table 1. Comparison of approximations to the percentage points of z

Formula (b): Existing formula (Fisher-Cochran).

Formula (c): New formula.

 %    n_1, n_2      6, 12    6, 60    24, 60   20, 36   20, 100  36, 60   24, 24   36, 36

 20   Formula (b)   0.2687   0.1901   0.1335   0.1577   0.1287   0.1212   0.1741   0.1415
      Formula (c)   0.2733   0.2020   0.1340   0.1580   0.1298   0.1213   0.1740   0.1415
      True z        0.2706   0.1965   0.1338   0.1579   0.1294   0.1213   0.1740   0.1415

 5    Formula (b)   0.5507   0.3990   0.2650   0.3128   0.2573   0.2390   0.3426   0.2778
      Formula (c)   0.5501   0.4100   0.2654   0.3129   0.2586   0.2391   0.3426   0.2778
      True z        0.5487   0.4064   0.2654   0.3129   0.2583   0.2391   0.3426   0.2778

 1    Formula (b)   0.7992   0.5646   0.3746   0.4441   0.3619   0.3385   0.4894   0.3955
      Formula (c)   0.7886   0.5698   0.3746   0.4435   0.3629   0.3384   0.4893   0.3955
      True z        0.7864   0.5687   0.3746   0.4435   0.3630   0.3384   0.4890   0.3954

 0.1  Formula (b)   1.1074   0.7474   0.4963   0.5928   0.4755   0.4503   0.6602   0.5307
      Formula (c)   1.0693   0.7372   0.4954   0.5906   0.4756   0.4498   0.6595   0.5304
      True z        1.0628   0.7377   0.4955   0.5905   0.4760   0.4498   0.6589   0.5302


 %    n_1, n_2      12, 6    60, 6    60, 24   36, 20   100, 20  60, 36

 20   Formula (b)   0.3509   0.3346   0.1566   0.1783   0.1656   0.1314
      Formula (c)   0.3506   0.3408   0.1569   0.1785   0.1665   0.1315
      True z        0.3510   0.3388   0.1568   0.1784   0.1661   0.1315

 5    Formula (b)   0.6884   0.6435   0.3047   0.3483   0.3208   0.2566
      Formula (c)   0.7001   0.6706   0.3060   0.3493   0.3236   0.2570
      True z        0.6931   0.6596   0.3055   0.3488   0.3227   0.2568

 1    Formula (b)   1.0120   0.9444   0.4368   0.4995   0.4615   0.3661
      Formula (c)   1.0370   0.9956   0.4391   0.5016   0.4662   0.3667
      True z        1.0218   0.9770   0.4385   0.5009   0.4666   0.3666

 0.1  Formula (b)   1.4352   1.3340   0.5930   0.6789   0.6303   0.4932
      Formula (c)   1.4681   1.4155   0.5965   0.6820   0.6375   0.4942
      True z        1.4449   1.3929   0.5962   0.6814   0.6371   0.4940


Table 1(a). Some values of z from formula (a) (exact cumulant formula)

(For corresponding true values, see Table 1)


^ 1 , 

0 / 

/o 

n t 

20 


5 


1 


0-1 



0-5457 

0-7843 


6 , 60 

24 , 60 

0-1998 

0-4022 

0-1338 

0-2652 

0-5640 

0-7433 

0-3744 

0-4956 


0-3499 

0-6958 


0-3335 

0-6627 


0 - 9854 

1 - 4026 


0-4388 

0-5966 














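The cumulant function of z is

$$K(t) = \tfrac{1}{2}it\log\frac{n_2}{n_1} + \log\Gamma\!\left(\frac{n_1+it}{2}\right) + \log\Gamma\!\left(\frac{n_2-it}{2}\right) - \log\Gamma\!\left(\frac{n_1}{2}\right) - \log\Gamma\!\left(\frac{n_2}{2}\right),$$

and the cumulants are obtainable by differentiating this successively with respect to (it),
at each stage putting t = 0.

Since
$$\left[\frac{d^r}{d(it)^r}\log\Gamma\!\left(\frac{n \pm it}{2}\right)\right]_{t=0} = (\pm 1)^r\,\frac{d^r}{dn^r}\log\Gamma\!\left(\frac{n}{2}\right),$$

and $\log\Gamma(n/2)$ can be expanded by Stirling's theorem in inverse powers of n, the cumulants
may also readily be expressed in inverse powers of $n_1$, $n_2$; and, when $n_1$, $n_2$ are reasonably
large, the first few terms only in the expansions will give sufficiently close approximations
to them. In fact, writing

$$\frac{1}{n_1} + \frac{1}{n_2} = s, \qquad \frac{1}{n_1} - \frac{1}{n_2} = d,$$

it has been shown that

$$\begin{aligned}
\kappa_1 &= -\tfrac{1}{2}d - \tfrac{1}{6}sd + O(n^{-4}),\\
\kappa_2 &= \sigma^2 = \tfrac{1}{2}s + \tfrac{1}{4}(s^2 + d^2) + O(n^{-3}),\\
\kappa_3 &= -\tfrac{1}{2}sd + O(n^{-3}),\\
\kappa_4 &= \tfrac{1}{4}s(s^2 + 3d^2) + O(n^{-4}),\\
\kappa_r &= O(n^{1-r}) \quad (r > 1).
\end{aligned} \tag{A}$$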
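The approximations (A) are easy to evaluate directly. The sketch below is an illustration only (the function name `approx_cumulants` is not from the paper); it returns the leading-term approximations to the first four cumulants for given $n_1$, $n_2$, so that they can be set beside exact values computed from polygamma tables.

```python
from math import sqrt

def approx_cumulants(n1, n2):
    """Approximate cumulants of z from equations (A), in terms of s and d."""
    s = 1 / n1 + 1 / n2
    d = 1 / n1 - 1 / n2
    k1 = -d / 2 - s * d / 6              # mean, error O(n^-4)
    k2 = s / 2 + (s * s + d * d) / 4     # variance, error O(n^-3)
    k3 = -s * d / 2                      # third cumulant, error O(n^-3)
    k4 = s * (s * s + 3 * d * d) / 4     # fourth cumulant, error O(n^-4)
    return k1, k2, k3, k4

k1, k2, k3, k4 = approx_cumulants(24, 60)
print(f"mean ~ {k1:.7f}, s.d. ~ {sqrt(k2):.6f}, k3 ~ {k3:.7f}, k4 ~ {k4:.7f}")
```

For (24, 60) the approximate mean agrees to about six decimal places with the exact value −0.0127429 quoted in Aroian's note later in this volume, while the approximate standard deviation differs from the exact 0.173779 in the fourth decimal place.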

Formula (a) will now have an extra term, since we take as our 'working variance' of z
not its exact value, but its approximation to order $n^{-1}$ from (A), i.e. $\tfrac{1}{2}s$. In the notation of
Kendall (1945): $l_2 = \kappa_2/\tfrac{1}{2}s - 1 \sim \tfrac{1}{2}(s + d^2/s)$.

We then obtain

$$z - \mu_1' \sim \sqrt{s/2}\,\xi - \frac{d}{6}(\xi^2-1) + \frac{s}{24}\sqrt{s/2}\,(\xi^3+3\xi) + \frac{d^2}{144}\sqrt{2/s}\,(\xi^3+11\xi) \tag{2}$$

$$\sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2-1), \tag{3}$$

where
$$h = \frac{2}{s}, \qquad \lambda = \frac{\xi^2+3}{6},$$

provided $\tfrac{1}{144}d^2\sqrt{2/s}\,(\xi^3+11\xi)$ may be neglected (which will be the case for small d).
Inserting the approximation to $\mu_1'$ from (A), i.e. $-\tfrac{1}{2}d$,

$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2), \tag{3a}$$

the Fisher-Cochran formula, which has, in fact, been found to give a fairly close approximation
to the true z for $n_1$, $n_2$ both reasonably large. It may be noted that if $n_1$, $n_2$ are not very
large, an improvement will be effected by including the second term in the estimate of the
mean $\kappa_1$, i.e. from (A) by adding $-\tfrac{1}{6}sd$.

For $(n_1, n_2)$ = (6, 12) this correction is −0.00347, and for (24, 60), the correction is
−0.00024. Inserting this improved approximation to $\mu_1'$ in (3) we have

Formula (b):
$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2+s). \tag{3b}$$

As pointed out by Wishart (1947, p. 179), an approximation to the value of any $\kappa_r$ (r > 1)
obtained by considering its leading term only, will be improved by writing $1/(n_1-1)$ and
$1/(n_2-1)$ in place of $1/n_1$ and $1/n_2$. For, by Stirling's expansion of a factorial,

$$\log\Gamma\!\left(\frac{n}{2}\right) = \frac{n-1}{2}\log\frac{n}{2} - \frac{n}{2} + \tfrac{1}{2}\log 2\pi + \dots,$$

$$\frac{d^2}{dn^2}\log\Gamma\!\left(\frac{n}{2}\right) = \frac{1}{2n} + \frac{1}{2n^2} + \frac{1}{3n^3} + O(n^{-5}),$$

and so on.


Table 2. ξ-coefficients required in applying formula (a)

                        20%        5%         1%         0.1%
 ξ                     0.84162    1.64485    2.32635    3.09023
 (1/6)(ξ² − 1)        -0.04861    0.28426    0.73532    1.42492
 (1/24)(ξ³ − 3ξ)      -0.08036   -0.02018    0.23379    0.84332
 (1/36)(2ξ³ − 5ξ)     -0.08377    0.01878    0.37634    1.21026
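The entries of Table 2 follow directly from the normal deviate at each level. A minimal sketch, assuming only the Python standard library (`statistics.NormalDist`); the function name `xi_coefficients` is illustrative and not from the paper:

```python
from statistics import NormalDist

def xi_coefficients(upper_tail_probability):
    """Normal deviate and the three xi-polynomials used in formula (a)."""
    xi = NormalDist().inv_cdf(1 - upper_tail_probability)
    c3 = (xi ** 2 - 1) / 6               # multiplies kappa_3 / sigma^2
    c4 = (xi ** 3 - 3 * xi) / 24         # multiplies kappa_4 / sigma^3
    c33 = (2 * xi ** 3 - 5 * xi) / 36    # multiplies kappa_3^2 / sigma^5 (subtracted)
    return xi, c3, c4, c33

for level in (0.20, 0.05, 0.01, 0.001):
    xi, c3, c4, c33 = xi_coefficients(level)
    print(f"{100 * level:>5.1f}%  {xi:.5f}  {c3:.5f}  {c4:.5f}  {c33:.5f}")
```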


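Thus writing

$$s' = \frac{1}{n_1-1} + \frac{1}{n_2-1}, \qquad d' = \frac{1}{n_1-1} - \frac{1}{n_2-1}, \tag{B}$$

we might expect a better approximation to z to be obtained, corresponding to that of Fisher
and Cochran, if we use $s'$ and $d'$ instead of s and d.

Corresponding to equations (A) we have

$$\begin{aligned}
\kappa_3 &= -\tfrac{1}{2}s'd' + O(n^{-4}),\\
\kappa_4 &= \tfrac{1}{4}s'(s'^2 + 3d'^2) + O(n^{-5}),\\
\kappa_r &= O(n^{1-r}) \quad (r > 1).
\end{aligned}$$

For the mean, however,
$$\mu_1' = \kappa_1 = -\tfrac{1}{2}d - \tfrac{1}{6}sd + O(n^{-4}), \tag{4}$$
$$\phantom{\mu_1' = \kappa_1} = -\tfrac{1}{2}d' + \tfrac{1}{3}s'd' + O(n^{-3}). \tag{4a}$$

If $n^{-3}$ is not negligible (relative to the degree of accuracy desired) $\mu_1'$ should therefore be left
in the form $-\tfrac{1}{2}d - \tfrac{1}{6}sd$.

Proceeding as before, we obtain

$$z - \mu_1' \sim \sqrt{s'/2}\,\xi - \frac{d'}{6}(\xi^2-1) + \frac{s'}{24}\sqrt{s'/2}\,(\xi^3-3\xi) + \frac{d'^2}{144}\sqrt{2/s'}\,(\xi^3-7\xi), \tag{5}$$

whence
$$z - \mu_1' \sim \frac{\xi}{\sqrt{h'-\lambda'}} - \frac{d'}{6}(\xi^2-1), \tag{6}$$

where
$$h' = 2/s', \qquad \lambda' = \tfrac{1}{6}(\xi^2 - 3).$$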


Since this is based in the first place on more accurate approximations to the cumulants
$\kappa_3$ and $\kappa_4$, and since the term omitted from (5) in deriving (6) (i.e. $\tfrac{1}{144}d'^2\sqrt{2/s'}\,(\xi^3 - 7\xi)$)
is evidently numerically less than the corresponding term omitted in obtaining the Fisher-Cochran
formula (i.e. $\tfrac{1}{144}d^2\sqrt{2/s}\,(\xi^3 + 11\xi)$), formula (6) might be expected to give an
improved approximation to z. In fact, however, it does not, and the reason is not
far to seek.

Consider the expansion of $\xi/\sqrt{h-\lambda}$ in both cases:

$$\frac{\xi}{\sqrt{h-\lambda}} = \frac{\xi}{\sqrt{h}}\left\{1 + \frac{\lambda}{2h} + \frac{3\lambda^2}{8h^2} + \dots\right\},$$

where the terms are decreasing in magnitude (since $1/h = \tfrac{1}{2}s = O(n^{-1})$). Hence the error
in neglecting all terms after the second will be approximately of the order of the third term.

Now in the Fisher-Cochran approximation this term, $\dfrac{3\lambda^2}{8h^2}\dfrac{\xi}{\sqrt{h}}$, has the same sign as the
omitted term $\tfrac{1}{144}d^2\sqrt{2/s}\,(\xi^3 + 11\xi)$ (both being of the same sign as $\xi$), so that the extra terms
included will tend to compensate for the term omitted. In obtaining (6) from (5), on the
other hand, the term omitted, $\tfrac{1}{144}d'^2\sqrt{2/s'}\,(\xi^3 - 7\xi)$, will be of opposite sign to $\xi$ when
$|\xi| < \sqrt{7}$, corresponding to a probability of about 0.004: so that for most percentage levels
encountered in practice, the error in (5) is increased in (6).

A better formula is obtained from (5) as

$$z - \mu_1' \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2-1), \tag{7}$$

where $h'$ and $\lambda'$ are as in (6).

Expanding $\xi\sqrt{h'+\lambda'}/h'$, it is found that the third term is now of opposite sign to $\xi$, and
hence the extra terms contained in the expansion will tend to compensate for the term
omitted. Since $s'$ and $d'$ require to be calculated in applying this formula, it is desirable to
write $\mu_1'$ in the form (4a) (provided we can neglect quantities of order $n^{-3}$). This gives

Formula (c):
$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2+2-2s'). \tag{7a}$$


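Collecting the results, we have the three approximate formulae:

Formula (a) (exact cumulant method):
$$z \sim \kappa_1 + \sigma\xi + \frac{\kappa_3}{\sigma^2}\,\frac{\xi^2-1}{6} + \frac{\kappa_4}{\sigma^3}\,\frac{\xi^3-3\xi}{24} - \frac{\kappa_3^2}{\sigma^5}\,\frac{2\xi^3-5\xi}{36}.$$

Formula (b) (Fisher-Cochran formula):
$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \frac{d}{6}(\xi^2+2+s),$$
where
$$s = \frac{1}{n_1} + \frac{1}{n_2}, \qquad d = \frac{1}{n_1} - \frac{1}{n_2}, \qquad h = \frac{2}{s}, \qquad \lambda = \frac{\xi^2+3}{6}.$$

Formula (c) (new formula):
$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \frac{d'}{6}(\xi^2+2-2s'),$$
where
$$s' = \frac{1}{n_1-1} + \frac{1}{n_2-1}, \qquad d' = \frac{1}{n_1-1} - \frac{1}{n_2-1}, \qquad h' = \frac{2}{s'}, \qquad \lambda' = \frac{\xi^2-3}{6}.$$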
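Formulae (b) and (c) can be compared numerically with a few lines of code. The following is a sketch under the definitions just collected; the function names `z_formula_b` and `z_formula_c` are introduced here for illustration and are not from the paper.

```python
from math import sqrt

def z_formula_b(n1, n2, xi):
    """Fisher-Cochran approximation, formula (b)."""
    s = 1 / n1 + 1 / n2
    d = 1 / n1 - 1 / n2
    h = 2 / s
    lam = (xi ** 2 + 3) / 6
    return xi / sqrt(h - lam) - d / 6 * (xi ** 2 + 2 + s)

def z_formula_c(n1, n2, xi):
    """New approximation, formula (c), using n - 1 in place of n."""
    s = 1 / (n1 - 1) + 1 / (n2 - 1)
    d = 1 / (n1 - 1) - 1 / (n2 - 1)
    h = 2 / s
    lam = (xi ** 2 - 3) / 6
    return xi * sqrt(h + lam) / h - d / 6 * (xi ** 2 + 2 - 2 * s)

XI_1_PERCENT = 2.32635  # normal deviate for the 1% level, from Table 2
for n1, n2 in ((6, 12), (24, 60)):
    print(n1, n2,
          round(z_formula_b(n1, n2, XI_1_PERCENT), 4),
          round(z_formula_c(n1, n2, XI_1_PERCENT), 4))
```

For the negative tail the same functions may be called with $n_1$ and $n_2$ interchanged, following the symmetry noted under formula (a).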





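When $n_1$, $n_2$ are large (and not too different), $\tfrac{1}{6}sd$ is negligible and formula (b) becomes the
formula more generally quoted

$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \left(\frac{1}{n_1} - \frac{1}{n_2}\right)\frac{\xi^2+2}{6},$$

which may be written

$$z \sim \frac{\xi}{\sqrt{h-\lambda}} - \left(\frac{1}{n_1} - \frac{1}{n_2}\right)\left(\lambda - \tfrac{1}{6}\right).$$

Similarly, for sufficiently large $n_1$, $n_2$, $\tfrac{1}{3}s'd'$ may be neglected and formula (c) becomes

$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right)\frac{\xi^2+2}{6},$$

or

$$z \sim \frac{\xi\sqrt{h'+\lambda'}}{h'} - \left(\frac{1}{n_1-1} - \frac{1}{n_2-1}\right)\left(\lambda' + \tfrac{5}{6}\right).$$

It is to be noted, however, that since $\tfrac{1}{3}s'd'$ is approximately twice $\tfrac{1}{6}sd$, more care must be
exercised in deciding to neglect it. For example, when $(n_1, n_2)$ is (20, 100), $\tfrac{1}{3}s'd' = 0.0009$,
and for (24, 60), its value is 0.0005.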

For purposes of comparison, values of z have been computed from formulae (b) and (c),
for the four common percentage levels, over a fairly wide range of $n_1$, $n_2$. They are shown in
Table 1, together with the corresponding true values of z. The latter were obtained where
possible from the tables of Fisher and Yates: elsewhere by inverse interpolation in Tables of
the Incomplete Beta-Function followed by a logarithmic transformation. Such values are in
error by not more than 0.0001. It will be seen that neither formula yields very accurate
results when $n_1$ or $n_2$ is as small as 6, though even here the new formula is rather better with
the single exception of $n_1 = 12$, $n_2 = 6$. In actual practice, however, we are concerned with
large values of $n_1$, $n_2$, beyond the range of the published tables. Considering only those cases
where $n_1$ and $n_2$ are both greater than 20, it is seen that formula (c) gives a consistently closer
approximation than does formula (b) for both the positive and the negative tails, and for
all the percentage levels investigated, though its relative gain in accuracy is greatest at the
1 and 0.1% levels. It may be noted, in fact, that in no case considered having $n_1$ and $n_2$
greater than 20, is the error more than 9 in the fourth decimal place, i.e. it appears
that for all except small $n_1$, $n_2$, this formula will give an approximation to z correct to
within 0.001.

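In conclusion, therefore, it is recommended that formula (c) be adopted for general use,
since it is no more difficult to compute, and is more accurate, than the existing formula.
Dropping the dashes we have the formula

$$z \sim \frac{\xi\sqrt{h+\lambda}}{h} - \frac{d}{6}(\xi^2+2-2s),$$

where
$$s = \frac{1}{n_1-1} + \frac{1}{n_2-1}, \qquad d = \frac{1}{n_1-1} - \frac{1}{n_2-1}, \qquad h = \frac{2}{s}, \qquad \lambda = \frac{\xi^2-3}{6},$$

or, if $\tfrac{1}{3}sd$ may be neglected,

$$z \sim \frac{\xi\sqrt{h+\lambda}}{h} - d\left(\lambda + \tfrac{5}{6}\right).$$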



The values of ξ and λ for the four percentage levels are:

          20%        5%         1%         0.1%
 ξ       0.8416     1.6449     2.3263     3.0902
 λ      -0.3819    -0.0491     0.4020     1.0916


My thanks are due to Dr J. Wishart, whose suggestion was the basis of this paper. 


REFERENCES 

Cochran, W. G. (1940). Note on an approximative formula for significance levels of z. Ann. Math.
Statist. 11, 93.

Cornish, E. A. & Fisher, R. A. (1937). Moments and cumulants in the specification of distributions.
Rev. Inst. Int. Statist. 4, 307.

Fisher, R. A. & Yates, F. (1943). Statistical Tables, 2nd ed. Oliver and Boyd.

Kendall, M. G. (1945). Advanced Theory of Statistics, 1, 150, 2nd ed. C. Griffin and Co.

Wishart, J. (1947). The cumulants of the z and of the logarithmic χ² and t distributions. Biometrika,
34, 170.






MISCELLANEA 


Note on the cumulants of Fisher's z-distribution


By LEO A. AROIAN, Hunter College

In a recent article Dr J. Wishart (1947) stated: 'Explicit expressions for the exact cumulants of Fisher's
z-distribution do not appear ever to have been published.' Fisher's z-distribution and the related
Snedecor's F-distribution formed a part of my doctor's thesis and rather full results concerning the cumulants
of the z-distribution and other properties of the distribution were published in the Annals of
Mathematical Statistics (Aroian, 1941) some time ago.* I should like to take this opportunity of adding certain
comments on the Gram-Charlier Type A approximation to the z-distribution and the type III
approximation to the F-distribution.

To obtain the cumulants of the z-distribution I expanded the moment generating function $M_z(\theta)$ in
powers of $\theta$ and found $\lambda_{k:z}$, the kth semi-invariant (or cumulant) of z, as the coefficient of $\theta^k/k!$. The exact
results correspond with Wishart's formulae (9) to (15), although given in a different form, and need not
be repeated here. In addition, asymptotic formulae for $\lambda_{k:z}$, $n_1$ and $n_2$ large, were derived by means of
the Euler-Maclaurin sum formula. Furthermore, another type of formula could have been given for
$n_1$ small but $n_2$ large, merely by expanding that part of $\lambda_{k:z}$ in which $n_2$ occurs by the Euler-Maclaurin
sum formula. The special cases for the logarithmic χ², the logarithmic t, and the logarithmic normal
probability functions follow by substituting the proper limiting values of $n_1$ and $n_2$.

In my previous paper I was overcautious concerning the type A approximation to the z-distribution.
Actually the method is fairly accurate although tedious. Taking

$$F(t) = \phi(t) - A_3\phi'''(t) + A_4\phi^{iv}(t), \qquad \int_{t_0}^{\infty} F(t)\,dt = \eta,$$

we have
$$\int_{t_0}^{\infty} F(t)\,dt = \int_{t_0}^{\infty}\phi(t)\,dt + \phi(t_0)\{A_3(t_0^2 - 1) + A_4(t_0^3 - 3t_0)\},$$

where $\eta$ is usually 0.10, 0.05, 0.025, 0.01, etc. As an example take $n_1 = 24$, $n_2 = 60$; then

$\lambda_{1:z} = -0.0127429$, $\sigma_z = 0.173779$, $\lambda_{3:z} = -0.0007998$, $\lambda_{4:z} = 0.0000867$,

$$A_3 = \frac{\lambda_{3:z}}{3!\,\sigma_z^3} = -0.025346, \qquad A_4 = \frac{\lambda_{4:z}}{4!\,\sigma_z^4} = 0.00396.$$

$t_0$ for $\eta = 0.05$ is 1.60094, $z_{0.05} = 0.26547$ against the accurate value of 0.26534. For the 1% point,
$t_0 = 2.2338$, $z_{0.01} = 0.3754$ against the accurate value of 0.3746. When $n_1 = n_2 = 24$, $z_{0.05} = 0.3423$
against the accurate value of 0.3425.
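The percentage point $t_0$ can be found by solving the tail equation numerically. The sketch below assumes the sign convention in which $A_3 = \lambda_{3:z}/(3!\,\sigma_z^3)$ is negative for this example, and uses plain bisection; the function names `upper_tail` and `solve_t0` are illustrative and not from the note.

```python
from math import erfc, exp, pi, sqrt

def upper_tail(t0, a3, a4):
    """Upper-tail area of the type A series beyond t0 (standard measure)."""
    phi = exp(-0.5 * t0 * t0) / sqrt(2 * pi)      # normal ordinate at t0
    q = 0.5 * erfc(t0 / sqrt(2))                  # normal upper-tail area
    return q + phi * (a3 * (t0 * t0 - 1) + a4 * (t0 ** 3 - 3 * t0))

def solve_t0(eta, a3, a4, lo=0.0, hi=6.0):
    """Bisection for the t0 whose upper-tail area equals eta."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if upper_tail(mid, a3, a4) > eta:
            lo = mid          # tail still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2

# Values quoted in the text for n1 = 24, n2 = 60.
mean, sigma, a3, a4 = -0.0127429, 0.173779, -0.025346, 0.00396
t0 = solve_t0(0.05, a3, a4)
print(round(t0, 5), round(mean + sigma * t0, 5))
```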

The type III approximation to the F-distribution is of some interest since for $n_1$ moderate and $n_2$
large, $n_1F$ tends to be distributed as χ² with $n_1$ degrees of freedom. Since

$$\text{Mean } F = \bar{F} = \frac{n_2}{n_2-2}, \qquad \sigma_F = \frac{n_2}{n_2-2}\sqrt{\frac{2(n_1+n_2-2)}{n_1(n_2-4)}}, \qquad \alpha_{3:F} = \frac{4(2n_1+n_2-2)}{n_2-6}\sqrt{\frac{n_2-4}{2n_1(n_1+n_2-2)}},$$

we find the 5, 1 or 0.1% points for F by using

$$F_{0.05} = \bar{F} + \sigma_F(1.64485 + 0.28392\,\alpha_{3:F} - 0.04902\,\alpha_{3:F}^2),$$
$$F_{0.01} = \bar{F} + \sigma_F(2.32635 + 0.73330\,\alpha_{3:F} - 0.024957\,\alpha_{3:F}^2),$$
$$F_{0.001} = \bar{F} + \sigma_F(3.0903 + 1.4190\,\alpha_{3:F} + 0.05667\,\alpha_{3:F}^2).$$


* [Both Dr Wishart, as author, and myself as editor regret that owing to wartime preoccupation the
publication of Dr Aroian's 1941 paper was overlooked. E.S.P.]


These formulae for the levels of significance of the χ² distribution are from a previous paper (Aroian,
1943). For $n_1 = 24$, $n_2 = 60$, $F_{0.05}$ by this approximation is 1.709 compared with the accurate value
of 1.700. For $n_1 = 24$, $n_2 = 100$, $F_{0.05}$ by this approximation is 1.631 against the accurate value of 1.627.
For $n_1 = n_2 = 100$, $F_{0.05}$ by this approximation is 1.394 as compared with the accurate value of 1.392.
While these results are not too poor, they are not so accurate as the well-known formulae of
Cochran-Fisher or of E. Paulson (1942) which, for large values of $n_1$ and $n_2$, generally give 4 significant figures.
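The 5% figures quoted above can be reproduced from the mean, standard deviation and skewness of F. The sketch below uses the standard expressions for these three quantities together with the 5% coefficients quoted in the note; the function name `f_point_type3_5pc` is illustrative only.

```python
from math import sqrt

def f_point_type3_5pc(n1, n2):
    """5% point of F from the type III approximation (requires n2 > 6)."""
    mean = n2 / (n2 - 2)
    sd = mean * sqrt(2 * (n1 + n2 - 2) / (n1 * (n2 - 4)))
    skew = 4 * (2 * n1 + n2 - 2) / (n2 - 6) * sqrt((n2 - 4) / (2 * n1 * (n1 + n2 - 2)))
    # Coefficients for the 5% level as quoted in the text.
    return mean + sd * (1.64485 + 0.28392 * skew - 0.04902 * skew ** 2)

for n1, n2 in ((24, 60), (24, 100), (100, 100)):
    print(n1, n2, round(f_point_type3_5pc(n1, n2), 3))
```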

REFERENCES 

Aroian, L. A. (1941). Ann. Math. Statist. 12, 429.

Aroian, L. A. (1943). Ann. Math. Statist. 14, 93.

Paulson, E. (1942). Ann. Math. Statist. 13, 233.

Wishart, J. (1947). Biometrika, 34, 170.


A note on the mean deviation from the median 


By K. R. NAIR 


For samples drawn from a normal universe, Godwin (1945) obtained the sampling distribution of the
mean deviation when the individual deviations are measured from the sample mean. It is well known that
the mean deviation is least when it is measured from the sample median.* Let us refer to them as 'mean
deviation from mean' and 'mean deviation from median' respectively, and use the letters m and m' to
denote their sample estimates.

The exact sampling distribution of m being now known and its probability integral tabulated, the
question may well be asked what the distribution of m' is. Since $m' \le m$, their expectations have the
same relationship
$$E(m') \le E(m). \tag{1}$$

For samples of n from a normal population with standard deviation $\sigma$, $E(m') = f_n'\sigma$ and $E(m) = f_n\sigma$,
where $f_n' \le f_n$. For getting unbiased estimates of $\sigma$ we should divide m' by $f_n'$ and m by $f_n$. What we are
now interested to know is which of the two estimates has a smaller standard error. In the case of m, it
has been shown by Helmert (1876) and Fisher (1920) that

$$f_n = \sqrt{\frac{2(n-1)}{n\pi}} \tag{2}$$

and

$$\text{S.E. of } (m/f_n) = \sigma\sqrt{\frac{1}{n}\left[\frac{\pi}{2} + \sqrt{n(n-2)} - n + \sin^{-1}\frac{1}{n-1}\right]}. \tag{3}$$

In the case of m', we neither know $f_n'$ nor the standard error of $(m'/f_n')$ for samples of size n.

It is obvious that when n is very large, the mean and median will differ very little from one another
and hence $m' \to m$ and $f_n' \to f_n$. It is interesting to note that, at the other end of the scale, namely, when
n = 2, m and m' are identical, and equal to one-half the sample range.

To discover any real difference that may exist between the standard errors of $(m/f_n)$ and $(m'/f_n')$,
which is the same as determining the difference between the coefficients of variation of m and m', we must
consider samples of size greater than 2.

(i) Let us take n = 3, and let $x_1$, $x_2$, $x_3$ be the observed values arranged in order of ascending magnitude.
We at once find that
$$m' = \tfrac{1}{3}(x_3 - x_1) = \tfrac{1}{3}w, \tag{4}$$
where w is the sample range.

The distribution of m' for samples of 3 is therefore derivable from that of the range. The probability
integral of the range has been tabulated by Pearson & Hartley (1942) for n = 2 to 20. For our purpose
it is necessary only to know the values of the mean range ($\bar{w}$) and the standard error of the range ($\sigma_w$)

* When n, the sample size, is an odd number, the sample median is by definition the value of the
½(n + 1)th ranked observation. When n is even, the sample median is conventionally taken as the mean
of the ½n-th and ½(n + 2)th ranked values. The mean deviation from the median will have the same
magnitude whatever value, between the ½n-th and ½(n + 2)th ranked values, the median takes, when
n is even. No complication is therefore introduced by accepting the conventional definition of the
median for even-sized samples.





for samples of 3. This can be calculated, correct to six decimal places, from certain numerical values
given by Pearson (1926). Using his figures,

$$\bar{w} = 1.692568 \times \sigma, \tag{5}$$
$$\sigma_w = 0.888368 \times \sigma. \tag{6}$$

The value of $f'$ for samples of 3 is, therefore,

$$f_3' = \tfrac{1}{3} \times 1.692568 = 0.56419,$$

and the standard error of $m'/f_3'$ is

$$\frac{0.888368}{1.692568}\,\sigma = 0.52486\,\sigma, \tag{7}$$

correct to five decimal places.

The corresponding values for $f_3$ and standard error of $(m/f_3)$ are obtained by putting n = 3 in
equations (2) and (3) and are

$$f_3 = \sqrt{\frac{4}{3\pi}}, \tag{8}$$

and

$$\text{S.E. of } (m/f_3) = \sigma\sqrt{\tfrac{1}{3}\left(\tfrac{2}{3}\pi + \sqrt{3} - 3\right)} = 0.52486\,\sigma, \tag{9}$$

correct to five decimal places.

Although (9) can be evaluated to any number of decimal places, we are not in a position to bring (7)
to a higher order of accuracy than five decimal places. It is very unlikely that (7) and (9) are absolutely
identical, but we may safely conclude that they are practically the same.
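Equations (2) and (3) are simple to evaluate for any n. A minimal sketch follows; the function names `f_n` and `se_of_scaled_m` are introduced here for illustration. It reproduces the figure in (9) and sets it beside the range-based figure in (7).

```python
from math import asin, pi, sqrt

def f_n(n):
    """E(m)/sigma for the mean deviation from the mean, equation (2)."""
    return sqrt(2 * (n - 1) / (n * pi))

def se_of_scaled_m(n):
    """Standard error of m/f_n in units of sigma, equation (3); requires n >= 2."""
    bracket = pi / 2 + sqrt(n * (n - 2)) - n + asin(1 / (n - 1))
    return sqrt(bracket / n)

# n = 3: compare equation (9) with the range-based value of equation (7).
print(round(se_of_scaled_m(3), 5), round(0.888368 / 1.692568, 5))
```

Since m/f_n has expectation σ, the same function also gives the coefficient of variation of m.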

(ii) We next come to samples of 4. If $x_1$, $x_2$, $x_3$, $x_4$ be the observations arranged in order of ascending
magnitude, the mean deviation from median is given by

$$m' = \tfrac{1}{4}(x_4 + x_3 - x_2 - x_1). \tag{10}$$

The distribution of m' follows immediately from 'some order statistic distributions for samples of
size 4' obtained by Walsh (1946) and is as follows:

$$p(m')\,dm' = \frac{12}{[\sqrt{2\pi}]^3}\,e^{-2m'^2}\left(\int_0^{4m'} e^{-\frac{1}{8}y^2}\,dy\right)^2 dm'. \tag{11}$$

The probability integral of m' is given by

$$P(m') = \int_0^{m'} p(m')\,dm' = \left(\frac{1}{\sqrt{2\pi}}\int_0^{4m'} e^{-\frac{1}{8}y^2}\,dy\right)^3 = \left(\frac{2}{\sqrt{2\pi}}\int_0^{2m'} e^{-\frac{1}{2}y^2}\,dy\right)^3. \tag{12}$$

The values of P(m') given by (12) can easily be evaluated using the normal probability integral table
and are given in cols. (3) and (6) of the table below, alongside corresponding values (given in cols. (2)
and (5)) for the probability integral of the mean deviation (m) from the mean, for samples of 4, copied
from Godwin's (1945) tables.


Table giving the probability integral of the mean deviation from (a) mean and (b) median
for samples of four observations from a normal universe (σ = 1)

 m (or m')    P(m)       P(m')       m (or m')    P(m)       P(m')
    0.0      0.00000    0.00000         1.3      0.96758    0.97229
    0.1      0.00333    0.00398         1.4      0.98229    0.98475
    0.2      0.02534    0.03003         1.5      0.99073    0.99192
    0.3      0.07879    0.09204         1.6      0.99534    0.99588
    0.4      0.16693    0.19139         1.7      0.99775    0.99798
    0.5      0.28345    0.31818         1.8      0.99895    0.99905
    0.6      0.41552    0.45629         1.9      0.99953    0.99957
    0.7      0.54836    0.58951         2.0      0.99980    0.99981
    0.8      0.66934    0.70592         2.1      0.99992    0.99992
    0.9      0.77040    0.79954         2.2      0.99997    0.99997
    1.0      0.84860    0.86962         2.3      0.99999    0.99999
    1.1      0.90502    0.91888         2.4      1.00000    1.00000
    1.2      0.94321    0.95162
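The P(m') column can be regenerated from (12), which in terms of the error function reads $P(m') = \{\operatorname{erf}(\sqrt{2}\,m')\}^3$. The sketch below also obtains the mean and coefficient of variation of m' by direct numerical integration (a trapezoidal rule, used here in place of the grouped-table calculation with Sheppard's correction); the function names are illustrative only.

```python
from math import erf, sqrt

def prob_integral(m):
    """P(m') of equation (12) for samples of 4, sigma = 1."""
    return erf(sqrt(2) * m) ** 3

def mean_and_cv(step=1e-3, upper=6.0):
    """Mean and coefficient of variation of m' by the trapezoidal rule."""
    n = int(round(upper / step))
    mean = second = 0.0
    for i in range(n + 1):
        m = i * step
        weight = 0.5 if i in (0, n) else 1.0
        tail = 1.0 - prob_integral(m)
        mean += weight * tail * step            # E(m')   = integral of (1 - P)
        second += weight * 2 * m * tail * step  # E(m'^2) = integral of 2m(1 - P)
    return mean, sqrt(second - mean * mean) / mean

for m in (0.1, 0.5, 1.0, 1.5):
    print(m, round(prob_integral(m), 5))
print(mean_and_cv())
```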







We note that although m and m' have an infinite range from 0 to ∞, their probability integrals rapidly
approach unity, this value being reached to five decimal place accuracy when m (m') = 2.4σ. We can
approximately work out the moments of the two distributions from the table above. The values of the
mean and the standard deviation (applying Sheppard's correction for grouping) of m and m' so obtained
are given below:

 Mean:                        $\bar{m} = 0.690986\,\sigma$,       $\bar{m}' = 0.663187\,\sigma$,
 Standard deviation:          $\sigma_m = 0.297015\,\sigma$,      $\sigma_{m'} = 0.292979\,\sigma$,        (13)
 Coefficient of variation:    $\sigma_m/\bar{m} = 0.429842$,      $\sigma_{m'}/\bar{m}' = 0.441775$.

The values of $\bar{m}$ and $\sigma_m/\bar{m}$ obtained from the exact formulae (2) and (3) are

$$\bar{m} = \sigma\sqrt{\frac{3}{2\pi}} = 0.690988\,\sigma, \qquad \sigma_m/\bar{m} = \tfrac{1}{2}\sqrt{\tfrac{1}{2}\pi + \sin^{-1}\tfrac{1}{3} + 2\sqrt{2} - 4} = 0.429842, \tag{14}$$

showing close agreement with the values given in (13) for the mean and coefficient of variation of m.
We may therefore consider the mean and coefficient of variation of m', approximately evaluated in (13),
to be of sufficient accuracy to warrant the conclusion that, for samples of size 4, the mean deviation from
the mean leads to a more 'efficient' estimate of the population standard deviation than the mean
deviation from the median. As the distribution of the latter is not known for n > 4, we are not in a position to
say whether this conclusion holds good, in general, for all values of n.

In conclusion, it seems worth making the following point:

(a) if expressions for the expectation and variance of m' were available and tables of its probability
integral worked out,

(b) if the efficiency of the m' estimate compared to the m estimate for n > 4 was not appreciably worse
than for the case n = 4,

there would be strong practical grounds for using m' rather than m in view of greater simplicity in
calculation. In both cases we must first arrange the observations in order of magnitude. Then if $x_1 \le x_2 \le \dots \le x_n$,
m' may be calculated from the formula

$$m' = \frac{1}{n}\{(x_{n-t+1} + x_{n-t+2} + \dots + x_n) - (x_1 + x_2 + \dots + x_t)\}, \tag{15}$$

where $t = \tfrac{1}{2}n$ or $\tfrac{1}{2}(n-1)$ according as n is even or odd.

For m, however, we must also calculate the arithmetic mean $\bar{x}$ and look for $x_k$ and $x_{k+1}$ between which
$\bar{x}$ lies. Then m can be obtained from one of the three formulae

$$\frac{nm}{2k} = \bar{x} - \frac{x_1 + x_2 + \dots + x_k}{k},$$
$$\frac{nm}{2(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \bar{x}, \tag{16}$$
$$\frac{n^2m}{2k(n-k)} = \frac{x_{k+1} + \dots + x_n}{n-k} - \frac{x_1 + \dots + x_k}{k}.$$

This certainly involves a rather longer process.
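The two short-cut calculations can be written out as follows. This is an illustrative sketch only: the function names are not from the note, and the second function uses the first of the three formulae (16), with a guard for the degenerate case in which no observation lies below the mean.

```python
def mean_dev_from_median(xs):
    """m' by formula (15): top t minus bottom t ordered values, divided by n."""
    x = sorted(xs)
    n = len(x)
    t = n // 2                      # n/2 for even n, (n - 1)/2 for odd n
    return (sum(x[n - t:]) - sum(x[:t])) / n

def mean_dev_from_mean(xs):
    """m by the first formula of (16): n*m/(2k) = xbar - (x_1 + ... + x_k)/k."""
    x = sorted(xs)
    n = len(x)
    xbar = sum(x) / n
    k = sum(1 for v in x if v < xbar)   # number of observations below the mean
    if k == 0:                          # all observations equal
        return 0.0
    return 2 * k * (xbar - sum(x[:k]) / k) / n

sample = [7, 1, 11, 4, 2]
print(mean_dev_from_median(sample), mean_dev_from_mean(sample))
```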

It is interesting to note that m' becomes a special case of the measure of dispersion based on the difference
between the sums of the first and the last r observations (in order of magnitude) suggested by Jones
(1946), the range becoming another special case of the same measure, when r = 1.


REFERENCES

Fisher, R. A. (1920). Mon. Not. R. Astr. Soc. 80, 758.

Godwin, H. J. (1945). Biometrika, 33, 254.

Helmert, W. (1876). Astr. Nachr. 88, no. 2096.

Jones, A. E. (1946). Biometrika, 33, 274.

Pearson, E. S. (1926). Biometrika, 18, 173.

Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 32, 301.

Walsh, J. E. (1946). Ann. Math. Statist. 17, 246.





On the method of paired comparisons 

By P. A. P. MORAN, Institute of Statistics, Oxford University 

M. G. Kendall & B. Babington Smith (1940) have discussed the 'method of paired comparisons' for
investigating preferences. Suppose we are given n objects A, B, C, ..., and an observer is asked to choose
between every pair. If A is preferred to B we write A → B. If the observer is not completely consistent,
either because of his own inefficiency or because the objects are not really capable of being ranked in
respect of the quality under consideration, he may make preferences of the type A → B → C → A, and we
call this an inconsistent or circular triad. Write d for the number of circular triads in a given experiment.
Then Kendall & Babington Smith show that

$$\zeta = 1 - \frac{24d}{n^3 - n} \quad (n \text{ odd}), \qquad \zeta = 1 - \frac{24d}{n^3 - 4n} \quad (n \text{ even})$$

may be regarded as a 'coefficient of consistence' and lies between 0 and 1, being capable of attaining
both these limits.

Now suppose that each comparison is made at random so that there are equal chances that A → B
and B → A. The distribution of d is then of interest. They calculate this distribution exactly for
n = 2, ..., 7 and conjecture that its moments are given by

$$\mu_1' = \frac{1}{4}\binom{n}{3}, \qquad \mu_2 = \frac{3}{16}\binom{n}{3}, \qquad \mu_3 = -\frac{3}{32}\binom{n}{3}(n-4),$$

$$\mu_4 = \frac{3}{256}\binom{n}{3}\left\{9\binom{n-3}{3} + 39\binom{n-3}{2} + 9\binom{n-3}{1} + 7\right\},$$

these being polynomials in n which agree with their numerical calculations for n = 2, ..., 7. They also
conjecture that the distribution tends to normality when n increases. In the present note we prove
these statements.

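Let the objects be numbered from 1 to n. Write $P_{ijk} = 1$ if the triad (i, j, k) is circular, and $P_{ijk} = 0$
if it is not. Then $d = \sum P_{ijk}$, the sum being taken over all such triads. Now by enumerating the various
cases we see that $E(P_{ijk}) = \tfrac14$ and so $\mu'_1(d) = E(\sum P_{ijk}) = \tfrac14\binom{n}{3}$. Now consider $\mu'_2(d) = E[(\sum P_{ijk})^2]$.
Consider the types of terms which result when we expand this. In the first place we have terms
typified by $P_{123}^2$, and these contribute $\tfrac14\binom{n}{3}$ to $\mu'_2(d)$. Similarly, we have terms typified by $P_{123}P_{145}$,
$P_{123}P_{124}$ and $P_{123}P_{456}$, and the numbers of these are respectively $\tfrac32\binom{n}{3}(n-3)(n-4)$, $3\binom{n}{3}(n-3)$ and
$\binom{n}{3}\binom{n-3}{3}$, whilst their expectations are each $\tfrac1{16}$. It follows that

$$\mu'_2(d) = \frac14\binom{n}{3} + \frac{1}{16}\binom{n}{3}\left\{\frac32(n-3)(n-4) + 3(n-3) + \binom{n-3}{3}\right\} = \frac{3}{16}\binom{n}{3} + \frac{1}{16}\binom{n}{3}^{2},$$

and so

$$\mu_2(d) = \frac{3}{16}\binom{n}{3}.$$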




The calculation of $\mu_3(d)$ is a good deal more complicated. $\mu'_3(d) = E[(\sum P_{ijk})^3]$, and on expanding we
get 16 types of terms, typified by

$$\begin{array}{llll}
P_{123}^3 & P_{123}^2P_{124} & P_{123}^2P_{145} & P_{123}^2P_{456} \\
P_{123}P_{124}P_{125} & P_{123}P_{124}P_{134} & P_{123}P_{124}P_{135} & P_{123}P_{124}P_{156} \\
P_{123}P_{145}P_{167} & P_{123}P_{124}P_{345} & P_{123}P_{145}P_{246} & P_{123}P_{124}P_{456} \\
P_{123}P_{145}P_{467} & P_{123}P_{124}P_{567} & P_{123}P_{145}P_{678} & P_{123}P_{456}P_{789}
\end{array}$$

After some calculation we find the sum of the contributions of these to be

$$\mu'_3(d) = \frac{1}{2304}\binom{n}{3}\{n^6 - 6n^5 + 13n^4 + 42n^3 - 158n^2 - 108n + 864\}.$$

Reducing to the mean we get

$$\mu_3(d) = -\frac{3}{32}\binom{n}{3}(n - 4).$$

The calculation of $\mu_4(d)$ is a great deal more complicated, there being 86 terms which are not zero;
we finally obtain, after lengthy calculations,

$$\mu'_4 = \frac{1}{55296}\binom{n}{3}\{n^9 - 9n^8 + 33n^7 + 45n^6 - 582n^5 + 504n^4 + 5732n^3 - 10692n^2 - 30024n + 80352\},$$

and so

$$\mu_4 = \frac{1}{55296}\binom{n}{3}\{972n^3 + 972n^2 - 36936n + 80352\},$$

which reduces to the conjectured result.
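
The four moment formulae can be checked by brute force for small n. The sketch below enumerates every possible set of preferences among n objects, counts the circular triads d in each, and compares the exact moments with the closed forms. The function names are illustrative and are not part of the original note; the closed form coded for the fourth moment is the Kendall and Babington Smith expression as read here, and should be treated as an assumption that this enumeration only confirms for the small n tried.

```python
from fractions import Fraction
from itertools import combinations, product
from math import comb


def circular_triads(n, beats):
    """Count circular triads; beats[(i, j)] is True when i is preferred to j (i < j)."""
    d = 0
    for i, j, k in combinations(range(n), 3):
        ij, jk, ik = beats[(i, j)], beats[(j, k)], beats[(i, k)]
        # Circular means i -> j -> k -> i, or the same cycle reversed.
        if (ij and jk and not ik) or (not ij and not jk and ik):
            d += 1
    return d


def exact_moments(n):
    """Mean and central moments 2..4 of d when every comparison is a fair coin toss."""
    pairs = list(combinations(range(n), 2))
    counts = {}
    for outcome in product((False, True), repeat=len(pairs)):
        d = circular_triads(n, dict(zip(pairs, outcome)))
        counts[d] = counts.get(d, 0) + 1
    total = 2 ** len(pairs)
    mean = Fraction(sum(d * c for d, c in counts.items()), total)

    def central(r):
        return sum(Fraction(c, total) * (d - mean) ** r for d, c in counts.items())

    return mean, central(2), central(3), central(4)


def formula_moments(n):
    """Closed forms for the same four quantities (valid for n >= 3)."""
    big_n = comb(n, 3)
    mean = Fraction(big_n, 4)
    mu2 = Fraction(3 * big_n, 16)
    mu3 = Fraction(-3 * big_n * (n - 4), 32)
    mu4 = Fraction(3 * big_n, 256) * (
        9 * comb(n - 3, 3) + 39 * comb(n - 3, 2) + 9 * comb(n - 3, 1) + 7
    )
    return mean, mu2, mu3, mu4


if __name__ == "__main__":
    for n in range(3, 6):
        print(n, exact_moments(n) == formula_moments(n))
```

Exact rational arithmetic is used so that agreement is an equality rather than a tolerance. The enumeration grows as 2 to the power n(n-1)/2, so it is only practical for n up to about 6.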

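We now prove that the distribution tends to normality. To do this, it is sufficient (Kendall, 1943,
p. 110) to prove that

$$\frac{\mu_{2m}}{\mu_2^{m}} \to \frac{(2m)!}{2^m\, m!}, \qquad \frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} \to 0, \qquad \text{for } m = 1, 2, \ldots.$$

Consider the second of these first. Write $Q_{ijk} = P_{ijk} - \tfrac14$. Then

$$\mu_{2m+1}(d) = E[(\textstyle\sum Q_{ijk})^{2m+1}].$$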


It is clear that for any given m we could calculate $\mu_{2m+1}(d)$, given sufficient labour, by expanding this
and considering the expectation of each type of term and calculating the number of times it occurs, which
will be a polynomial in n. Now consider the various types of terms in the expansion of $(\sum Q_{ijk})^{2m+1}$. We
classify these terms according to whether the Q's have common suffixes. Let $Q_{ijk}Q_{lmn}\ldots Q_{pqr}$ be a
typical product in the expansion. If this can be separated into p groups of products of Q's such that
different groups have no common suffixes whilst within each group the triads are connected to each other
by having common points, we shall say such a product ‘contains p groups’. Moreover, the number of
times such a term occurs will be a polynomial in n whose order is equal to the number of distinct suffixes
occurring in the product. If in a group a suffix only appears once, the inconsistency of the triad containing
it is unaffected by the remainder of the group and the expectation of the product of Q's in that group will
be zero. It follows that in all those terms which contribute something non-zero to $\mu_{2m+1}$, none of
the groups can contain a suffix which appears only once. Therefore, since all terms which contain more
than m groups will have at least one group consisting of a single Q, the expectation of such terms will
be zero. It follows that $\mu_{2m+1}$ is a polynomial in n, of degree $3m + 1$ = [integral part of $\tfrac32(2m + 1)$] at
most, whose coefficients depend on m only. But $\mu_2(d)$ is of degree 3 in n and so

$$\frac{\mu_{2m+1}}{\mu_2^{m+\frac12}} = O\!\left(\frac{n^{3m+1}}{n^{3m+\frac32}}\right) \to 0.$$

Now consider $\mu_{2m}$. This is a polynomial in n, and our aim is to find the order and coefficient of the term of
largest order. In the first place we need only consider terms with m or fewer groups, for if a term has more
than m groups, one at least will consist of a single Q and the expectation of the term will be zero. Moreover,
as before, in each term, the suffixes in each group must each occur at least twice in that group. The
number of times each type of term occurs will be a polynomial in n of order equal to the total number of
distinct suffixes in that term. As we shall show the leading term in $\mu_{2m}(d)$ to be of order 3m, we can neglect
terms whose frequency is less than this and therefore we can neglect all terms in which a suffix appears
more than twice. Now consider a term with fewer than m groups and therefore containing a group of
order greater than two in the Q's. As no suffix can occur more than twice, no Q can occur more than once.
Consider any $Q_{ijk}$, say, of this group. Then either the suffixes i, j, k are common to three other triads or
one, i, say, is common to another triad and j, k common to a third. In either case evaluation of the
expectation shows it to be zero. We can therefore restrict our attention to the case where there are m
groups each containing two triads. Such groups can only be of the form $Q_{123}^2$, $Q_{123}Q_{124}$, $Q_{123}Q_{145}$ and the
expectations of the two latter are zero whilst the expectation of $Q_{123}^2$ is $\tfrac{3}{16}$.

The number of groups is m and the number of ways of choosing m such distinct pairs out of $(\sum Q_{ijk})^{2m}$
is $\dfrac{(2m)!}{2^m\, m!}$, so that the leading term in $\mu_{2m}$ is

$$\frac{(2m)!}{2^m\, m!}\left(\frac{n^3}{32}\right)^{m},$$

whilst the leading term in $\mu_2$ is $\tfrac{1}{32}n^3$ and so

$$\frac{\mu_{2m}}{\mu_2^{m}} \to \frac{(2m)!}{2^m\, m!}.$$

The distribution therefore tends to normality.


REFERENCES 

Kendall, M. G. & Babington Smith, B. (1940). Biometrika, 31, 324.

Kendall, M. G. (1943). The Advanced Theory of Statistics, 1. London: Charles Griffin and Co.


Notes on the calculation of autocorrelations of linear 
autoregressive schemes 

By M. H. QUENOUILLE 


1. Bartlett (1940) has recently shown how, for a series of observations, we can test whether the
observations can be adequately represented by a linear autoregressive scheme

$$u_{n+j} + a_1u_{n+j-1} + \ldots + a_ju_n = \epsilon_{n+j}, \tag{1}$$

where the $a_i$ are known or fitted values, and $\epsilon_{n+j}$ is an error component independent of
$u_{n+j-1}, u_{n+j-2}, \ldots$ Bartlett's test is based on the formula

$$\operatorname{cov}(r_s, r_{s+t}) \sim \frac{1}{n}\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t},$$

where $r_s$ is the estimate of the true autocorrelation $\rho_s$ between $u_i$ and $u_{i+s}$.

The purpose of the note is to demonstrate how, using generating functions, $\rho_i$ and
$\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}$ can be calculated with the minimum of computation.


2. The method of generating functions seems to have been used by Wold (1938), who applied them to
finding the variances and covariances of linear forms of finite extent in variables such as $\epsilon_n$. We shall,
however, be concerned with linear forms of infinite extent.

It can easily be shown that the solution of (1) can be written

$$u_n = \epsilon_n + b_1\epsilon_{n-1} + b_2\epsilon_{n-2} + \ldots, \tag{2}$$

where

$$(1 + a_1t + \ldots + a_jt^j)^{-1} = 1 + b_1t + b_2t^2 + \ldots \tag{3}$$

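For example, if $u_{n+2} + au_{n+1} + bu_n = \epsilon_{n+2}$, we have

$$(1 + at + bt^2)^{-1} = (1 - 2x\cos\theta + x^2)^{-1} = 1 + \frac{\sin 2\theta}{\sin\theta}\,x + \frac{\sin 3\theta}{\sin\theta}\,x^2 + \ldots,$$

where $\cos\theta = -\tfrac12 a/\sqrt{b}$, $x = t\sqrt{b}$, and hence

$$b_i = \frac{\sin (i+1)\theta}{\sin\theta}\,b^{\frac12 i} = \frac{2b^{\frac12(i+1)}}{\sqrt{(4b - a^2)}}\,\sin (i+1)\theta.$$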
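
As a numerical check on the second-order example, the coefficients $b_i$ can be obtained either by inverting the power series term by term or from the trigonometric closed form. The sketch below does both; the function names and the trial values a = -1.1, b = 0.5 are illustrative assumptions, and the closed form requires 4b > a² so that θ is real.

```python
import math


def inverse_series(a_coeffs, terms):
    """First `terms` coefficients b_0, b_1, ... of 1 / (1 + a_1 t + ... + a_j t^j)."""
    b = [1.0]
    for i in range(1, terms):
        # Coefficient of t^i in (1 + a_1 t + ...)(b_0 + b_1 t + ...) must vanish.
        acc = 0.0
        for k in range(1, min(i, len(a_coeffs)) + 1):
            acc += a_coeffs[k - 1] * b[i - k]
        b.append(-acc)
    return b


def closed_form(a, b, i):
    """b_i = b^(i/2) * sin((i+1) theta) / sin(theta), with cos(theta) = -a / (2 sqrt(b))."""
    theta = math.acos(-a / (2.0 * math.sqrt(b)))
    return b ** (i / 2.0) * math.sin((i + 1) * theta) / math.sin(theta)


if __name__ == "__main__":
    series = inverse_series([-1.1, 0.5], 8)
    for i, value in enumerate(series):
        print(i, round(value, 6), round(closed_form(-1.1, 0.5, i), 6))
```

The recursive inversion works for a scheme of any order, whereas the trigonometric form is specific to the second-order case with complex roots.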





3. Using this generating function we have

$$\sigma^2\sum_{i=-\infty}^{\infty}\rho_it^i = \frac{1}{(1 + a_1t + \ldots + a_jt^j)(1 + a_1t^{-1} + \ldots + a_jt^{-j})}. \tag{4}$$

Now the expansion of (4) can be achieved by splitting into partial fractions and, in general, we can let
o -* 1 before this operation is performed. Thus 

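$$\frac{t^j}{(1 + a_1t + \ldots + a_jt^j)(t^j + a_1t^{j-1} + \ldots + a_j)} = \frac{A_0 + A_1t + \ldots + A_{j-1}t^{j-1}}{1 + a_1t + \ldots + a_jt^j} + \frac{B_1t^{j-1} + B_2t^{j-2} + \ldots + B_j}{t^j + a_1t^{j-1} + \ldots + a_j},$$

and using $\rho_i = \rho_{-i}$, we can see that

$$A_i = -B_j/a_j \quad (i = 0), \qquad A_i = B_i - a_iB_j/a_j \quad (i = 1, \ldots, j-1).$$

Thus the autocorrelations will be generated by

$$\sum_{i=-\infty}^{\infty}\rho_it^i = 1 + \frac{1}{\sigma^2}\left\{\frac{t(B_1 + B_2t + \ldots + B_jt^{j-1})}{1 + a_1t + \ldots + a_jt^j} + \frac{B_1t^{j-1} + B_2t^{j-2} + \ldots + B_j}{t^j + a_1t^{j-1} + \ldots + a_j}\right\}, \qquad \sigma^2 = -B_j/a_j, \tag{6}$$

where the first term is expanded in powers of t and the second term is expanded in powers of $t^{-1}$.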


4. The expression (6) can now be squared to give a generating function for $\sum_{i=-\infty}^{\infty}\rho_i\rho_{i+t}$. It will be
necessary to split

$$\frac{t(B_1 + B_2t + \ldots + B_jt^{j-1})(B_1t^{j-1} + B_2t^{j-2} + \ldots + B_j)}{(1 + a_1t + \ldots + a_jt^j)(t^j + a_1t^{j-1} + \ldots + a_j)}$$

into partial fractions, but the labour will be reduced since the matrix of the coefficients of the equations
in $B_i$ will be unaltered.


5. To illustrate the method, we can consider Kendall's series 1, which was used by Bartlett in his
example.

The autoregressive scheme for this series is

$$u_{n+2} - 1.1u_{n+1} + 0.5u_n = \epsilon_{n+2},$$

so that

$$\sigma^2\sum_{i=-\infty}^{\infty}\rho_it^i = \frac{t^2}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_2 + (B_1 + 2.2B_2)t}{1 - 1.1t + 0.5t^2} + \frac{B_2 + B_1t}{t^2 - 1.1t + 0.5},$$

where

$$\begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} 3.7692 & 2.1154 \\ -2.1154 & -1.4423 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2.1154 \\ -1.4423 \end{bmatrix}.$$

Thus $\sigma^2 = 2.8846$, and

$$\sum_{i=-\infty}^{\infty}\rho_it^i = \frac{0.7333 - 0.5t}{1 - 1.1t + 0.5t^2}\,t + 1 + \frac{0.7333 - 0.5t^{-1}}{1 - 1.1t^{-1} + 0.5t^{-2}}\,t^{-1}, \tag{7}$$

so that

$$\rho_i - 1.1\rho_{i-1} + 0.5\rho_{i-2} = 0 \quad (i > 0).$$

If we now consider the square of the expression (7) we have a product term

$$\frac{2t(0.7333 - 0.5t)(0.7333t - 0.5)}{(1 - 1.1t + 0.5t^2)(t^2 - 1.1t + 0.5)} = \frac{-2B_2 + (B_1 + 2.2B_2)t}{1 - 1.1t + 0.5t^2} + \frac{B_2 + B_1t}{t^2 - 1.1t + 0.5}, \tag{8}$$

where

$$\begin{bmatrix} B_1 \\ B_2 \end{bmatrix} = \begin{bmatrix} 3.7692 & 2.1154 \\ -2.1154 & -1.4423 \end{bmatrix}\begin{bmatrix} -0.7333 \\ 1.5754 \end{bmatrix} = \begin{bmatrix} 0.5686 \\ -0.7210 \end{bmatrix},$$

and, if we write $P_i = \sum_k\rho_k\rho_{k+i} \big/ \sum_k\rho_k^2$,

$$\sum_{k=-\infty}^{\infty}\rho_k^2\sum_{i=-\infty}^{\infty}P_it^i = 1 + 2t\,\frac{0.7333 - 0.5t}{1 - 1.1t + 0.5t^2} + t^2\left(\frac{0.7333 - 0.5t}{1 - 1.1t + 0.5t^2}\right)^2 + 1.4420 + t\,\frac{0.5686 - 0.7210t}{1 - 1.1t + 0.5t^2} + \text{terms in } t^{-1}$$

$$= 2.4420 + t\,\frac{2.0352 - 1.7210t}{1 - 1.1t + 0.5t^2} + t^2\,\frac{0.5377 - 0.7333t + 0.25t^2}{(1 - 1.1t + 0.5t^2)^2} + \text{terms in } t^{-1}. \tag{9}$$

From this we have $\sum_{i=-\infty}^{\infty}\rho_i^2 = 2.4420$ and the ‘correlations’ $P_i$ of the correlations are 0.8334, 0.4321,
0.0006, .... Successive terms may be calculated using the relation

$$P_i - 2.2P_{i-1} + 2.21P_{i-2} - 1.1P_{i-3} + 0.25P_{i-4} = 0 \quad (i > 0). \tag{10}$$

The calculation of $\sum_{i=-\infty}^{\infty}P_i^2$, suggested by Bartlett, can also be made by this method, but it is more
arduous, and the first few terms will give a good approximation.
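
The figures of the worked example can be reproduced by direct summation rather than by partial fractions. The sketch below, with illustrative function names that are not part of the original note, builds the autocorrelations of the scheme with coefficients -1.1 and 0.5 from their recurrence, sums the lagged products over a long truncated range, and forms the ratios $P_i$. Truncating at 200 lags is an assumption that is harmless here because the autocorrelations decay geometrically.

```python
def autocorrelations(a1, a2, n_terms):
    """rho_0 .. rho_{n_terms-1} for the scheme u_{n+2} + a1*u_{n+1} + a2*u_n = e_{n+2}."""
    # rho_1 follows from rho_1 + a1*rho_0 + a2*rho_{-1} = 0 with rho_{-1} = rho_1.
    rho = [1.0, -a1 / (1.0 + a2)]
    while len(rho) < n_terms:
        rho.append(-a1 * rho[-1] - a2 * rho[-2])
    return rho[:n_terms]


def variance_ratio(a1, a2):
    """Variance of u divided by the variance of the error term."""
    rho = autocorrelations(a1, a2, 3)
    return 1.0 / (1.0 + a1 * rho[1] + a2 * rho[2])


def lagged_sum(rho, lag):
    """Sum over all i of rho_i * rho_{i+lag}, using rho_{-i} = rho_i and the stored terms only."""
    m = len(rho)

    def full(i):
        return rho[abs(i)] if abs(i) < m else 0.0

    return sum(full(i) * full(i + lag) for i in range(-m + 1, m))


if __name__ == "__main__":
    rho = autocorrelations(-1.1, 0.5, 200)
    total = lagged_sum(rho, 0)
    print("rho_1 =", round(rho[1], 4))
    print("sigma^2 =", round(variance_ratio(-1.1, 0.5), 4))
    print("sum rho^2 =", round(total, 4))
    print("P_1..P_3 =", [round(lagged_sum(rho, k) / total, 4) for k in (1, 2, 3)])
```

Direct summation costs more arithmetic than the generating-function method but makes no use of the partial-fraction coefficients, so it serves as an independent check on them.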


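7. The same method can be used to calculate the appropriate number of degrees of freedom for
testing the correlation between two linear autoregressive schemes.

In general, if $E(u_iu_j) = \rho_{ij}\sigma^2$, $E(v_iv_j) = \rho'_{ij}\sigma'^2$ and

$$r = \frac{\sum_{i=1}^{n}u_iv_i}{\sqrt{\left(\sum_{i=1}^{n}u_i^2\,\sum_{i=1}^{n}v_i^2\right)}},$$

then

$$\operatorname{var} r \sim \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}\rho_{ij}\rho'_{ij}}{\sum_{i=1}^{n}\rho_{ii}\,\sum_{i=1}^{n}\rho'_{ii}}.$$

For linear autoregressive schemes, $\rho_{ij} = \rho_{i-j}$, $\rho'_{ij} = \rho'_{i-j}$, and thus

$$\operatorname{var} r \sim \frac{n + 2(n-1)\rho_1\rho'_1 + \ldots + 2\rho_{n-1}\rho'_{n-1}}{n^2} \sim \sum_{i=-\infty}^{\infty}\rho_i\rho'_i \Big/ n.$$

Thus, provided n is large, r can be tested with $n \big/ \sum\rho_i\rho'_i$ degrees of freedom, and the calculation
of $\sum_{i=-\infty}^{\infty}\rho_i\rho'_i$ can be made by the above method.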

8. Finally, it is worth noting that, for autoregressive schemes involving m observables, it is possible
to extend this method by the use of m parameters to calculate the correlations within and between the
observables, provided that adequate estimates of the coefficients of the equations are available. In
practice, however, the procedure will often be reversed, and estimates of the coefficients of the
autoregressive schemes will be obtained by equating the theoretical and observed correlations.


REFERENCES 

Bartlett, M. S. (1940). J. R. Statist. Soc. Suppl. 8, 27.

Wold, H. (1938). A Study in the Analysis of Stationary Time-series. Uppsala. 





Approximate formulae for the percentage points of the incomplete beta function
and of the χ² distribution

By D. HALTON THOMSON 

Valuable ‘Tables of percentage points of the Incomplete Beta Function’ have been published in
Biometrika (Thompson, 1941a) giving numerical values of percentage points at various probability levels
between P = 0.995 and P = 0.005 for degrees of freedom ν₁ = 2q and ν₂ = 2p ranging up to 120, and with
an accuracy of five significant figures. In the same volume, a ‘Table of the percentage points of the χ²
distribution’ was also published (Thompson, 1941b) for values at the same probability levels and degrees
of freedom ranging up to ν = 100, and with an accuracy of six significant figures, thus supplementing the
table of that function originally due to R. A. Fisher (Fisher & Yates, 1938).

Cases arise in practice where the tails of the frequency distribution of a large population are of special
interest, thus involving (in the case of the beta function) values of 2p larger than 120, with a small 2q,
or vice versa. Harmonic interpolation between 120 and infinity, however, leads to substantial errors,
as is found when the values of the percentage points x are expressed in terms of their tail values (x or
1 − x < 0.5). This Note shows that close approximations to such extreme values may be determined by
using the χ² table as an auxiliary table to extend the Beta Function Tables in conjunction with certain
simple alternative formulae. Comparisons within the range of the published Beta Function Tables are
made indicating the degree of accuracy within that range. The accuracy of these formulae beyond that
range increases rapidly with increasing 2p and decreasing 2q (and vice versa), so that they can be applied
with confidence under such conditions.

The ‘normalized’ form of the Incomplete Beta Function, in the usual notation, is

$$P = I_x(p, q) = \frac{\Gamma(p+q)}{\Gamma(p)\,\Gamma(q)}\int_0^x x^{p-1}(1-x)^{q-1}\,dx, \tag{1}$$

in which, for a given P, 1 − x(q, p) in the tables denotes the upper percentage point and x(p, q) the lower
percentage point.

It is known that, when p is large and q is small compared with p, this form tends towards the Incomplete
Gamma Function

$$P = \frac{1}{\Gamma(q)}\int_{pt}^{\infty}e^{-y}y^{q-1}\,dy,$$

where $x(p, q) = e^{-t}$. This in turn may be transformed to the χ² distribution by putting $pt = [\chi^2(P)]/2$.
For a given large p and small q, therefore, the percentage point in terms of χ² is given approximately by

$$x(2p, 2q) \cong \exp\left[-\frac{\chi^2(P)}{2p}\right], \tag{2}$$

where 2q = ν in the χ² table. This expression gives the exact value of x when 2q = 2, but for larger 2q
the error, which is consistently negative, increases rapidly with increasing 2q unless 2p is very large,
much larger than 120. It is, therefore, of limited practical use. The following modifications were in
consequence evolved.

Approximation A

Consider the constant of integration in the original form (1) which, when expanded, is

$$\frac{(p+q-1)(p+q-2)\ldots(p+1)\,p}{\Gamma(q)}.$$

Let the terms q − 1, q − 2, ..., 1, 0 be averaged; the constant as a first approximation then becomes

$$\frac{[p + \tfrac12(q-1)]^{q}}{\Gamma(q)}.$$

The numerator suggests that a more accurate approximation for x would be obtained by substituting
$p + \tfrac12(q-1)$ in place of p in (2), thus leading to

$$x(2p, 2q) \cong \exp\left[-\frac{\chi^2(P)}{2p+q-1}\right]. \tag{A}$$



A comparison of the approximate values of x obtained from (A) with the exact values in the Beta
Function Tables, for all probability levels between P = 0.995 and P = 0.005, shows that:

(a) The error is consistently positive, but much smaller than the negative error in (2); in other words
the latter is slightly over-corrected.

(b) For a given p/q and varying P, the error is nearly constant; it is smallest at P = 0.995 and increases
gradually in the direction of P = 0.005.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that p/q is larger than 4, the value of x is within 0.5 % of the exact tail value; if p/q is
larger than 10, the error is within 0.1 % of that value.


Approximation B

The exponent in (A) may be written

$$-\frac{\chi^2(P)}{2p+q-1} = -\frac{\chi^2(P)}{2q}\cdot\frac{2q}{2p+q-1} = -\frac{\chi^2(P)}{2q}\left[\frac{2}{2n-1}\right].$$

The factor in square brackets is equivalent to the first term in the known expansion of the form

$$\log\left(\frac{n}{n-1}\right) = 2\left\{\frac{1}{2n-1} + \frac{1}{3(2n-1)^3} + \ldots\right\},$$

where n = (2p + 2q − 1)/2q, which converges rapidly when n is large; i.e. when 2p is large compared
with 2q. The above exponent may therefore be written

$$\frac{\chi^2(P)}{2q}\,\log\left(\frac{2p-1}{2p+2q-1}\right),$$

which, when inserted in (A), leads to

$$x(2p, 2q) \cong \left(\frac{2p-1}{2p+2q-1}\right)^{k}, \tag{B}$$

where $k = [\chi^2(P)]/(2q)$.

A similar comparison with the Beta Function Tables, for the same range of probability levels, shows
that:

(a) Approximation (B) gives generally more accurate values than (A), except when 2q is very small,
in which case they are nearly identical.

(b) For a given p/q and varying P, the error is negligible in the vicinity of P = 0.25; it increases
negatively in the direction of P = 0.995, and positively in the direction of P = 0.005, the largest errors
occurring at this level.

(c) For a given P, the error decreases rapidly with increasing 2p and/or decreasing 2q.

(d) Provided that (2p)³/(2q)² is larger than about 150, the values of x are within 0.5 % of the exact
tail value; this implies that if 2p is larger than about 150, this degree of accuracy is attained even when
p/q is as low as unity. If (2p)³/(2q)² is larger than about 2000, the error is within 0.1 % of the exact value,
which implies that if 2p is larger than about 120, this degree of accuracy is attained when p/q is as low as 4.

It will be observed that, when 2q = 2, the formula does not revert exactly to (2), as is required by theory;
but, unless 2p is also quite small, the error in the computed value of x is negligible.

The expansion of (B) lead* to +ts)v+sv\ 


where

$$v = \frac{\chi^2(P)}{2p+2q-1} \quad \text{and} \quad s = \frac{q}{2p+2q-1},$$

thus demonstrating its analogies with Campbell’s formula (C) below.
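
The behaviour of formula (2) and of approximations (A) and (B) can be examined numerically when p and q are whole numbers, since both the χ² tail for even degrees of freedom and the incomplete beta function then reduce to finite sums. The sketch below is written under that restriction; all function names are illustrative assumptions, χ²(P) is taken as the value exceeded with probability P for 2q degrees of freedom, and percentage points are found by plain bisection rather than by any tabulated method.

```python
import math


def chi2_upper_point(P, q):
    """Chi-square value with 2q degrees of freedom (q a whole number) exceeded with probability P."""

    def upper_tail(chi2):
        y = chi2 / 2.0
        term, total = 1.0, 1.0
        for k in range(1, q):
            term *= y / k
            total += term
        return math.exp(-y) * total  # Poisson sum form of the tail

    lo, hi = 0.0, 20.0 * q + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if upper_tail(mid) > P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def incomplete_beta(x, p, q):
    """I_x(p, q) for whole-number p and q, as a binomial tail sum."""
    n = p + q - 1
    return sum(math.comb(n, j) * x ** j * (1.0 - x) ** (n - j) for j in range(p, n + 1))


def exact_point(P, p, q):
    """x such that I_x(p, q) = P, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if incomplete_beta(mid, p, q) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def approx_2(P, p, q):
    return math.exp(-chi2_upper_point(P, q) / (2.0 * p))


def approx_a(P, p, q):
    return math.exp(-chi2_upper_point(P, q) / (2.0 * p + q - 1.0))


def approx_b(P, p, q):
    k = chi2_upper_point(P, q) / (2.0 * q)
    return ((2.0 * p - 1.0) / (2.0 * p + 2.0 * q - 1.0)) ** k


if __name__ == "__main__":
    P, p, q = 0.995, 60, 2  # 2p = 120, 2q = 4
    print("exact", exact_point(P, p, q))
    print("(2)  ", approx_2(P, p, q))
    print("(A)  ", approx_a(P, p, q))
    print("(B)  ", approx_b(P, p, q))
```

For 2p = 120, 2q = 4 and P = 0.995 the values produced can be compared with the corresponding row of Table 1. Half-integer p or q would need a general incomplete beta routine, which is outside this sketch.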


Adaptation of Campbell’s formula

In a book concerned primarily with quality control, Simon (1941) quotes (without the proof) a formula,
due to Campbell (1923), designed to determine the average number of defectives in a sample of n, starting
from the known average number in an infinite sample. It is a particular application of the general
problem now under consideration, namely, the approximate determination of the percentage points of
the Beta Function, starting from the corresponding known values for the χ² form of the Poisson
exponential binomial summation. It is given in the following form:

° (C,W, 2S>r ,f) =^- 1 +Atl4^* + (3o + 2)^+«]n-* + ..., (3) 

a(c, oo, P) 

where a(c, n, P) = average number of defectives in which P is the probability of at least c defectives in
a sample of n, a(c, ∞, P) = average number of defectives in an infinite sample, A = ½(c − a − 1), in which
a = a(c, ∞, P). (Simon quotes a = (a, ∞, P), which is an evident misprint.)

If G denotes the value given by the formula, then

a(c, n, P) = a(c, ∞, P) (1 + G),

so that 1 + G is the factor by which the average number of defectives in an infinite sample must be
multiplied to give that in a sample of n.

The change from Campbell's notation to the more familiar general notation is given by
a(c, n, P) = {1 − x(2p, 2q)} n, a = a(c, ∞, P) = [χ²(P)]/2,
where n = p + q − 1 and c = q.

Let u = a/n and r = (c − 1)/(2n),

then A = n(r − u/2).

By inserting this notation in (3) and rearranging the terms, the formula leads to 

x(2p , 2q)~ l-|l + r|l + Jr + — u + (l + 


which expression includes the first four terms in the expansion of $e^{-u}$.

Hence, for the determination of the percentage points x(2p, 2q), Campbell's formula may, in effect,
be re-written as


x(2p f 2 q) ~ e~ u — r 


(l + tr+^ju+Ur*, 


(C) 


where

$$u = \frac{\chi^2(P)}{2(p+q-1)} \quad \text{and} \quad r = \frac{q-1}{2(p+q-1)}.$$

For large 2 p and small 2q, the last two terms become negligible, in which case it reduces to 

x(2p f 2 q) £ e- - r( 1 + Jr) u. ( C') 


Cochran’s approximation 


Cochran (1940), extending a method of Fisher's (1925), has introduced a useful approximation for
the percentage points of the Incomplete Beta Function, when both p and q are large, his method being
to determine a sufficiently accurate value of z, as used in Fisher's z-transformation.

If y is the normal deviate at probability level P, then for a given pair of arguments 2p, 2q, the following
are first calculated, using Hartley's (1941) notation:


A = i(j/* + 3), A = 


Spq 


2p + 2q 

y , p) 

.J(A-A) ' pA 


Hence, by Fisher's transformation,

$$x(2p, 2q) \cong \frac{2p}{2p + 2q\,e^{2z}}. \tag{D}$$


Comparison of formulae

Table 1 compares the various formulae for upper percentage points at an extreme probability level
(P = 0.995). Table 2 indicates their relative accuracy on a common basis, namely, as a percentage of the
exact value of x or 1 − x, whichever is the smaller, so that the deviations from the exact values, when
x or 1 − x approach zero, are duly emphasized. For intermediate probability levels, the percentages lie
between the tabulated extremes. It will be noted that in the case of approximation B the errors pass
through zero near the mid-range of P; in the cases of A and C the errors are positive for all values of P.



The general conclusions from these tables and other comparisons are that, for a given probability
level P:

(a) When p/q > about 6, approximations A, B and C have about the same degree of accuracy, so that
the simpler, A or B, have the advantage.

(b) In the range 6 > p/q > 4, there is little to choose between B and C; but B is the simpler.

(c) When p/q < 4 and the distribution approaches symmetry, D gives the best results, provided that
2p and 2q are moderately large, say > 60. It may be, however, that B in this range will be sufficiently
accurate for many purposes; if p/q > 2, the maximum error of x is about 2 units in the third decimal place.


Table 1. Comparison of approximate formulae at a given probability level

P = 0.995

                                        x(2p, 2q)
  2p    2q        A              B              C              D           Exact
                                            (Campbell)     (Cochran)

 120     2   0.999916461    0.999916459    0.999916461    0.999862      0.999916461
                 Nil       -0.000000002        Nil       -0.000054

         4   0.9982908      0.9982906      0.9982907      0.997926      0.9982907
            +0.0000001     -0.0000001         Nil        -0.000365

        10   0.982764       0.982755       0.982760       0.982076      0.982759
            +0.000005      -0.000004      +0.000001      -0.000683

        20   0.944002       0.943893       0.943941       0.943366      0.943930
            +0.000072      -0.000037      +0.000011      -0.000564

        30   0.902230       0.901839       0.902000       0.901551      0.901950
            +0.000280      -0.000111      +0.000050      -0.000399

        40   0.86160        0.86070        0.86106        0.86066       0.86093
            +0.00067       -0.00023       +0.00013       -0.00027

        60   0.78782        0.78522        0.78621        0.78568       0.78579
            +0.00203       -0.00057       +0.00042       -0.00011

       120   0.62598        0.61430        0.61855        0.61620       0.61620
            +0.00978       -0.00190       +0.00235          Nil

N.B. The lower figure of each pair (printed in italics) is the difference between the approximate and exact values.


Table 2. Relative accuracy of approximate formulae

Error of x(2p, 2q) expressed as a percentage of the smaller exact tail value (x or 1 − x < 0.5)

                          A                      B                      C                      D
                                                                   (Campbell)             (Cochran)
  2p   2q   p/q   P = 0.995  0.500  0.005    0.995  0.500  0.005    0.995  0.500  0.005    0.995  0.500  0.005
                       %      %      %        %      %      %        %      %      %        %      %      %

 120   12   10        *      *      *        *      *      *        *      *      *       -2.8   -0.1   +0.3
       20    6      +0.1   +0.2   +0.2     -0.1     *    +0.1       *      *    +0.1     -1.0     *    +0.1
       40    3      +0.5   +0.6   +0.7     -0.2     *    +0.2     +0.1   +0.1   +0.3     -0.2     *      *
       60    2      +0.9   +1.1   +1.2     -0.3     *    +0.2     +0.2   +0.1   +0.6     -0.1     *      *
      120    1      +2.5   +2.7   +4.4     -0.5     *    +0.7     +0.6   +0.5   +1.7       *      *      *

  30    3   10        *      *    +0.1       *      *      *        *      *    +0.1    -30.0   -1.4   -7.1
        5    6      +0.1   +0.1   +0.3     -0.1   -0.1   +0.1       *      *    +0.4    -11.1   -0.5   -1.8
       10    3      +0.4   +0.5   +0.8     -0.3   -0.1   +0.3     +0.1   +0.1   +1.3     -2.2   -0.1   -0.5
       15    2      +0.8   +1.0   +1.9     -0.5   -0.1   +0.6     +0.3   +0.1   +2.8     -0.7     *    -0.3
       30    1      +2.4   +2.7   +7.1     -1.0   -0.1   +1.8     +1.0   +0.5   +8.2       *      *      *

* Error smaller than ± 0.05 %.







Wilson-Hilferty approximation for χ²: adjustment

This formula (Wilson & Hilferty, 1931) for the percentage points of the χ² distribution is

$$\chi^2(P) = \nu\left\{1 - \frac{2}{9\nu} + y_P\sqrt{\frac{2}{9\nu}}\right\}^{3},$$

where ν represents the degrees of freedom, and $y_P$ the standardized normal deviate corresponding to
probability level P. A table has been published in Biometrika (Merrington, 1941), comparing the
approximations derived from this formula with the exact values, at various probability levels between
P = 0.995 and P = 0.005. It shows the remarkable accuracy of the formula, the maximum errors
varying from about ± 0.04, when ν = 30, to about ± 0.024, when ν = 100.

When these errors were plotted against the exact values on logarithmic paper, it was observed that
for a given probability level, they varied inversely with √ν very closely. It follows that this square root
relation may be used to adjust the Wilson-Hilferty formula, bringing the values computed therefrom
still nearer to the exact values.

If the difference (at ν = 30) between the Wilson-Hilferty value and the exact value, when multiplied
by √(100/30), is treated as a coefficient C (which may be positive or negative), the required adjustment
for any value of ν is given by

Adjustment = C/√ν.

For various probability levels P, the values of C are given in the following table:


    P         C              P         C

  0.995    +0.233          0.250    +0.039
  0.990    +0.157          0.100    +0.056
  0.975    +0.067          0.050    +0.035
  0.950    +0.011          0.025    -0.015
  0.900    -0.029          0.010    -0.120
  0.750    -0.046          0.005    -0.227
  0.500    -0.013


A test against the Merrington Table shows that this adjustment leads to values of χ², between ν = 30
and ν = 100 at all probability levels, with an accuracy of ± 0.001, i.e. to four or five significant figures.
Since the Wilson-Hilferty approximation assumes a normal distribution about 1 − 2/(9ν), which tends
to unity as ν increases to infinity, and since the adjustment tends to zero under those conditions, it
follows that the latter may also be safely applied for an indefinitely large ν.

It should be added that an adjustment on similar principles is not applicable to the Fisher
approximation for χ².
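
The adjustment can be sketched in a few lines. The code below applies the Wilson-Hilferty formula and then adds C/√ν, with C read from the table of coefficients. Two points are assumptions rather than statements of the Note: P is taken as the upper-tail probability, so that the normal deviate is the value exceeded with probability P, and the adjustment is taken to be added to the Wilson-Hilferty value, a sign convention that reproduces tabulated χ² values in the cases tried. The function and table names are illustrative.

```python
import math
from statistics import NormalDist

# Coefficients C by probability level P, as tabulated in the Note.
C_TABLE = {
    0.995: 0.233, 0.990: 0.157, 0.975: 0.067, 0.950: 0.011,
    0.900: -0.029, 0.750: -0.046, 0.500: -0.013, 0.250: 0.039,
    0.100: 0.056, 0.050: 0.035, 0.025: -0.015, 0.010: -0.120,
    0.005: -0.227,
}


def wilson_hilferty(nu, P):
    """Wilson-Hilferty approximation to the chi-square value exceeded with probability P."""
    y = NormalDist().inv_cdf(1.0 - P)  # normal deviate exceeded with probability P
    return nu * (1.0 - 2.0 / (9.0 * nu) + y * math.sqrt(2.0 / (9.0 * nu))) ** 3


def adjusted(nu, P):
    """Wilson-Hilferty value plus the square-root adjustment C / sqrt(nu)."""
    return wilson_hilferty(nu, P) + C_TABLE[P] / math.sqrt(nu)


if __name__ == "__main__":
    for nu, P in ((30, 0.995), (30, 0.005), (100, 0.005)):
        print(nu, P, round(wilson_hilferty(nu, P), 4), round(adjusted(nu, P), 4))
```

Only the tabulated probability levels are supported; interpolating C between levels is not something the Note describes and is deliberately left out.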


REFERENCES 

Campbell, G. A. (1923). Bell Syst. Tech. J. January.

Cochran, W. G. (1940). Note on an approximate formula for significance levels of z. Ann. Math.
Statist. 11, 93.

Fisher, R. A. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver and Boyd.
1st edition.

Fisher, R. A. & Yates, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research.
Edinburgh: Oliver and Boyd.

Hartley, H. O. (1941). Tables of percentage points of the Incomplete Beta Function. Methods of
interpolation. Biometrika, 32, 166.

Merrington, M. (1941). Numerical approximations to the percentage points of the χ² distribution.
Biometrika, 32, 200.

Simon, L. E. (1941). An Engineer's Manual of Statistical Methods, p. 185. New York: John Wiley
and Sons, Inc.

Thompson, C. M. (1941a). Tables of percentage points of the Incomplete Beta Function. Biometrika,
32, 151.

Thompson, C. M. (1941b). Table of percentage points of the χ² distribution. Biometrika, 32, 187.

Wilson, E. B. & Hilferty, M. M. (1931). The distribution of chi-square. Proc. Nat. Acad. Sci.
Wash. 17, 684.





REVIEWS 


A First Course in Mathematical Statistics. By C. E. Weatherburn. Cambridge
University Press. Price 15s.

An outstanding feature of the present statistical time is the number of text-books which are being
written, and each one from a slightly different point of view. It is this which makes statistical theory
interesting to study, for there can be no rigid approach to a subject which is used and expounded by so
many and diverse persons. Professor Weatherburn has taken a rather formal mathematical exposition
of the subject, and mathematical students will find his book both interesting and profitable to read.
Numerical examples are given for the reader to apply the appropriate mathematical technique. It is
possible that these would have been of greater utility if they had contained the material in its crude state,
and had not been streamlined so that the application of the technique is immediately obvious, but
nevertheless many new examples are there.

I am not sure whether this book will be entirely useful to students of other subjects than mathematics.
While the mathematical analysis is undoubtedly clear it is possible that many will not be able to follow
it in detail, and the conclusions of the analysis are not emphasized strongly. We may contrast with this
Fisher's Statistical Methods for Research Workers, where no analysis is given, but where the relevant
formulae and their interpretation are stated unmistakeably and their applications to material in its
crude state set out so that the student may calculate for himself.

Probability theory is the foundation stone on which the whole of statistical theory is built. It is
disappointing therefore to find that it is given somewhat perfunctory treatment in one chapter and the part
it plays in (say) statistical tests of significance is not brought out and emphasized. There is a tendency
nowadays in applying statistical technique to regard the 5 % and 1 % levels of significance as sacrosanct
and those coming fresh to the subject should learn that custom is the only reason for their choice.

In spite of the criticisms which I make, however, I would recommend this book to students who have
obtained some idea of the aims and objectives of statistical theory, and who are desirous of learning the
development of the mathematical technique as well as its application. Professor Weatherburn's
mathematical analysis makes pleasant reading and may well throw new light on old methods for those who
have learnt the rudiments of the theory.

F. N. DAVID 


Advances in Genetics, Volume 1. New York, N.Y. : Academic Press. 1947. 

This is the first number of a new periodical, probably an annual, summarizing recent work in various
fields. Of the nine articles, ranging from 12 to 96 pages, with mean 42.6, S.D. 7.89, and a positively skew
distribution, perhaps the most interesting to European geneticists will be that on the genetics of the
ciliate Protozoa, Paramecium and Euplotes. Here Sonneborn describes work almost entirely done in
America, with very surprising results. Thus Paramecium aurelia consists of at least seven endogamous
varieties, each with two exogamous mating types, which might be called sexes were it not that in
P. bursaria one of the varieties has no less than eight mating types.

Shrode and Lush's article on the genetics of cattle gives a very condensed account of the large amount
of work which has been done on the inheritance of economically important characters such as milk yield
and growth rate. For example cattle biometricians have used the important concept of ‘heritability’,
meaning the fraction of the variance of a character due to additive genetic differences. Within a herd
this rarely exceeds 30 %. More space is devoted to work on the genetics of colour and the like, which is
of far less economic importance, and the review of progeny testing methods is disappointingly brief.
However, the bibliographical references will be useful. Similarly, Atwood's article on forage crops, though
most valuable as a guide to the literature, does not give a detailed account of any of the biometric work
which has been done on grasses and clovers.

Only two of the papers give data which a biometrician could immediately utilize. These are Gordon's
account of polymorphism in fish populations, and Spencer's of mutations in wild Drosophila species,
which unfortunately does not include some valuable recent Italian and Russian work. Gordon's results
call for the development of methods of estimating gene frequency similar to those used with human blood
groups. Spencer is mainly concerned with results, but these are often given in sufficient detail to interest
biometricians, though no attempt is made to summarize Wright's fundamental statistical theory.





The other articles will be less attractive to biometricians, though it is of interest to see how statistical 
methods are demanded by the mere fact that the genus Crepis, whose evolution is reviewed by Babcock, 
includes 196 species, most of which have been examined cytologically, and between which 130 of the 
38,220 possible crosses have been made. 

The volume will be indispensable to geneticists. Biometricians certainly cannot neglect it. 

J. B. S. H. 

Mathematical Methods of Statistics. By H. Cramér. Princeton University Press. 1946. $6.00. 

This book was written by Prof. Cramér during the war and has been published first in Sweden and then 
by an offset process by the Princeton University Press in the U.S.A. It is a definitive exposition of the 
theory of mathematical statistics as it existed in 1940 (about) and it is worth while therefore to consider 
its contents in some detail. Prof. Cramér has divided his exposition into three parts; the first part is 
purely mathematical. The theory of sets and of such Lebesgue measure as is necessary for the understanding 
of the second part is developed first of all. Such a development will be useful for the student 
of mathematical statistics coming fresh to the theory of measure in that he receives guidance as to what 
are the elements essential for him to understand. Chapters 11 and 12 on matrices, determinants and 
quadratic forms and miscellaneous complements do not fit into this general scheme but have obviously 
been included here as part of the mathematical equipment necessary for the student. Possibly Chapter 10 
on Fourier Integrals would have fitted more naturally into Part II but this is a matter of taste. 

Part II begins with a formal development of the theory of probability as given by the French and 
Russian schools of probability, and which Prof. Cramér has already given in his Cambridge tract ‘Random 
Variables and Probability Distributions’. The treatment here seems simpler, however, than in his earlier 
tract and there is a more practical flavour to his exposition. This part while still purely mathematical 
begins to introduce distributions and ideas which are familiar to the statistician. 

The title of the third part is ‘Statistical Inference’ and the main outline is that of small sample theory 
developed during the past twenty-five years. The illustrations are numerical as well as mathematical 
and an attempt is made to show the student the numerical applications of the processes through which 
his mathematical theory leads him. The treatment is not exhaustive but the student who has assimilated 
this part will have little difficulty in extending his knowledge by further reading. 

As a textbook of mathematical statistics this book will remain unrivalled for many years to come. The 
mathematical exposition is clear, the development of ideas logical throughout, and the theorems are 
presented in a very general way. Any student of mathematics who wishes to get a picture of what statistical 
theory is about will be led inevitably to a study of this book. To those who wish to become statisticians 
it will be necessary to supplement the reading by a practical course in which the mathematical tools 
are tried out on numerical examples. This aspect of statistical work the book does not cover, but it is 
obvious that this would be the case from the title. It only remains to say to the student ‘This is a good 
book, buy it’. 

F. N. DAVID 


CORRIGENDA 

( Biometrika , 34, 176-7) 

In J. Wishart’s paper on ‘The cumulants of the z and of the logarithmic χ² and t distributions’, 
the following correction should be made: 

p. 176, 1st line of section 3: read ‘log | t |’ for ‘log t’, in two places. 
p. 177, 1st line following equation (32): read ‘log | x |’ for ‘log x’. 





BIOMETRIKA 


FOUNDED BY 

W. F. R. WELDON, FRANCIS GALTON and KARL PEARSON 


MANAGING EDITOR 

E. S. PEARSON 


ASSOCIATE EDITORS 

M. G. KENDALL JOHN WISHART 


in consultation with 

HARALD CRAMER J. B. S. HALDANE 

R. C. GEARY G. M. MORANT 

MAJOR GREENWOOD 


VOLUME XXXV 


1948 


ISSUED BY 

THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE, LONDON 
PRINTED AT THE UNIVERSITY PRESS, CAMBRIDGE 



PRINTED IN GREAT BRITAIN 



CONTENTS OF VOLUME XXXV 

Memoirs 

                                                                          PAGES 

I. James Fowler Tocher. With frontispiece . . . 1-5 

II. On some modes of population growth leading to R. A. Fisher’s 
logarithmic series distribution. By D. G. Kendall . . . 6-16 

III. The studentized form of the extreme mean square test in the analysis 
of variance. By K. R. Nair . . . 16-31 

IV. The estimation of non-linear parameters by ‘internal least squares’. 
By H. O. Hartley. With three Figures in the Text . . . 32-45 

V. The geometrical method in the theory of sampling. By David Fog. 
With two Figures in the Text . . . 46-54 

VI. Proofs of the distribution law of the second order moment statistics. 
By John Wishart. With one Figure in the Text . . . 55-57 

VII. Tests of significance in multivariate analysis. By C. Radhakrishna 
Rao . . . 58-79 

VIII. Alternative systems in the analysis of variance. By N. L. Johnson . . . 80-87 

IX. An examination and further development of a formula arising in 
the problem of comparing two mean values. By Alice A. Aspin . . . 88-96 

X. On the power function of the longest run as a test for randomness 
in a sequence of alternatives. By G. Bateman. With three 
Figures in the Text . . . 97-112 

XI. Sur les courbes de fréquence de K. Pearson. By M. Dumas. With 
eighteen Figures in the Text . . . 113-117 

XII. The distribution of the extreme deviate from the sample mean and 
its studentized form. By K. R. Nair . . . 118-144 

XIII. The Fisher-Yates test of significance in 2 x 2 contingency tables. 
By D. J. Finney . . . 145-156 

XIV. The power function of the test for the difference between two proportions 
in a 2 x 2 table. By P. B. Patnaik. With four Figures 
in the Text . . . 167-175 

XV. The analysis of contingency tables with groupings based on quantitative 
characters. By F. Yates . . . 176-181 

XVI. The probability integral transformation when parameters are 
estimated from the sample. By F. N. David and N. L. Johnson. 
With three Figures in the Text . . . 182-190 

XVII. A table for the calculation of working probits and weights in probit 
analysis. By D. J. Finney and W. L. Stevens . . . 191-201 

XVIII. Some further notes on the use of matrices in population mathematics. 
By P. H. Leslie . . . 213-245 

XIX. The transformation of Poisson, binomial and negative-binomial 
data. By F. J. Anscombe . . . 246-254 

XX. Some theorems on time series. II. The significance of the serial 
correlation coefficient. By P. A. P. Moran . . . 255-260 




XXI. Some results in the testing of serial correlation coefficients. By 
M. H. Quenouille . . . 261-267 

XXII. Fractional replication arrangements for factorial experiments with 
factors at two levels. By K. A. Brownlee, B. K. Kelly and 
P. K. Loraine . . . 268-276 

XXIII. The relationship between finite groups and completely orthogonal 
squares, cubes and hyper-cubes. By K. A. Brownlee and 
P. K. Loraine . . . 277-282 

XXIV. Systematic sampling of continuous parameter populations. By 
A. E. Jones . . . 283-290 

Continuation of Dr Jones’s Paper. By M. G. Kendall . . . 291-296 

XXV. The precision of observed values of small frequencies. By J. B. S. 
Haldane . . . 297-300 

XXVI. Note on Professor Haldane’s paper regarding the treatment of rare 
events. By E. S. Pearson . . . 301-303 

XXVII. A further note on the mean deviation. By H. J. Godwin . . . 304-309 

XXVIII. A note on the asymptotic distribution of range. By D. R. Cox. 
With four Figures in the Text . . . 310-316 

XXIX. On the role of variable generation time in the development of a 
stochastic birth process. By David G. Kendall. With two 
Figures in the Text . . . 316-330 

XXX. 2 x 2 Tables; the power function of the test on a randomized experiment. 
By E. S. Pearson and Maxine Merrington. With 
eight Figures in the Text . . . 331-345 

XXXI. Statistical analysis of a non-orthogonal tri-factorial experiment. 
By W. L. Stevens . . . 346-367 

XXXII. Comparisons of heights and weights of German civilians recorded in 
1946-7 and Royal Air Force and other British series. By G. M. 
Morant. With twenty-one Figures in the Text . . . 368-396 

XXXIII. Testing the significance of correlation between time series. By G. H. 
Orcutt and S. F. James. With fourteen Figures in the Text . . . 397-413 


Miscellanea 
                                                                          PAGES 

(i) A note on the χ² smooth test. By H. L. Seal . . . 202 

(ii) Rank correlation and product-moment correlation. By P. A. P. Moran . . . 203-206 

(iii) Tests of significance in the variate difference method. By N. L. Johnson . . . 206-209 

(iv) Review of M. G. Kendall’s The Advanced Theory of Statistics. By B. L. 
Welch . . . 210 

(v) Note on the median of a multivariate distribution. By J. B. S. Haldane. 
With one Figure in the Text . . . 415 

(vi) A property of rank correlations. By H. E. Daniels . . . 416 

(vii) Approximation errors in distributions of independent variates. By 
H. O. Hartley . . . 417 

(viii) Correlations between χ² cells. By F. N. David . . . 418 

(ix) Note on ‘Proofs of the distribution law of the second order moment 
statistics’. By John Wishart . . . 422 

(x) Review of D. J. Finney’s Probit Analysis. By N. L. Johnson . . . 423 








JAMES FOWLER TOCHER 
1864-1945 



Volume XXXV, Parts I and II 


May 1948 


JAMES FOWLER TOCHER 

James Fowler Tocher (1864-1945), chemist, ethnologist, biometrician, agriculturist and 
man of affairs, began adult life by opening a chemist’s business in Peterhead more than sixty 
years ago. Even sixty years ago, scientific chemistry was not restricted to professors or even 
analysts, and certainly proprietors of businesses sometimes made money. The Southron, 
however, unduly influenced by jokes about Aberdonians — most of which are based on the 
principle, lucus a non lucendo — would not think Peterhead a promising venue either for 
scientific research or earning a competence. In fact, the business was a financial success; 
Tocher disposed of it in 1912 and when he passed to the analytical branch of professional 
chemistry, he had already made a name among chemists and biometricians (he was President 
of the Pharmaceutical Society of Great Britain in 1908). 

Like several other men who have done important scientific work Tocher owed a good deal 
to a local society, the Buchan Field Club. In the south, although the name Field Club is 
used, e.g. the Essex Field Club, Natural History Society is perhaps a commoner designation 
and can include all the descriptive sciences. It is quite possible that these societies are 
educationally almost as valuable as classes in technical colleges even now; sixty years ago 
they were invaluable. The Buchan Field Club published transactions; in these appeared 
Tocher’s earliest statistical papers, the very first in 1896. 

They were concerned with the ethnology of Buchan and form the basis of the wider 
studies with which his name is usually associated. Among the earlier papers is one, written 
jointly with a frequent collaborator, Mr, afterwards Prof., James Gray, on the ‘Frequency 
and Pigmentation Value of Surnames in East Aberdeenshire’. 

More than ninety years ago, William Farr wrote an essay on the statistics of surnames 
which appeared in the Sixteenth Annual Report of the Registrar General and may have been 
read by Tocher because it is reprinted in the memorial volume edited by Noel Humphreys 
which preserves some, but by no means all, of the delightful essays by an old master. Farr 
used the registers of births and deaths, and found in a sample of 276,405 names 32,818 
different surnames or 11.9 % different surnames. In Wales the proportion was much 
smaller. ‘The name of John Jones is a perpetual incognito in Wales, and being proclaimed 
at the cross of a market town would indicate no one in particular.’ Farr touched on the local 
distributions and asked such questions as whether the present predominance of the Smiths 
were due to the original numerical strength of that great family, or ‘to some special 
circumstances acting upon the ordinary laws of increase, owing to which the descendants of 
the hammer-men have multiplied at a greater rate than the bearers of any other name?’ 
Did the progeny, he asked, of the tawny Browns increase faster than that of the fair-complexioned 
Whites? 

Tocher interested himself in such problems. The data of the investigation of Gray & 
Tocher (printed in 1902) were pigmentation records of 14,661 school-children in East 
Aberdeenshire. These 14,661 children had 761 surnames, or 5.2 %, a smaller proportion than 
in Farr’s data, as one might expect, since in a school population there would be many more 
brothers and sisters than in the birth and death registers for two quarters. In East Aberdeenshire, 
Smith, scoring 203, was easily beaten by Milne with 267; third came Taylor, which is 
fourth on Farr’s list for England and Wales. Gray & Tocher worked out the mean and 
standard deviation in pigmentation units of the children with different surnames. ‘The 


blondest surname in our list is Pirie.... Next to Pirie comes the surname Wallace. This 
points to the conclusion that the Wallace sept sprang from ancestors with a decided blonde 
tendency.’ 

The use of the standard deviation in this paper, as well as an allusion to Karl Pearson’s 
work on assortative mating, show that either Gray or Tocher or both were already readers 
of K.P. At what date Tocher made the personal acquaintance with Karl Pearson, which 
ripened into a warm friendship, I do not know. Five years later, in 1907,* Tocher’s first 
full-scale biometric memoir appeared in this Journal (Vol. 5, pp. 299-350). The original data 
are printed as an appendix to the volume. It is a careful biometric study of the stature, 
pigmentation, and craniometry of inmates of Mental Homes in Scotland. In 1908 (Vol. 6, 
pp. 129-235, Appendix 1-63) is an equally careful study of the pigmentation of school-children 
in Scotland. These researches were financed by the Henderson Trust who published 
in book form (Oliver & Boyd, 1924) a summary by Tocher of the results recorded in the 
earlier papers together with those of later anthropometric observations on samples of the 
civil populations of Aberdeenshire, Banffshire and Kincardineshire and on soldiers of 
Scottish nationality. 

The use of superlatives is not a method of science — there is no greatest ‘scientist’ — but, 
whether any anthropometric survey of another part of the British Isles so far undertaken 
can take precedence over the voluntary work of Tocher supported financially not by the state 
but by a private trust, is a question which may fairly be asked. I think the answer is ‘ no ’. 
Tocher’s analysis followed the lines of teaching so many of us older people look back upon 
with gratitude. It may well be that, since his time, the technique of craniometry has been 
extended and genetic research has modified some of the perhaps rather naive theories of 
inheritance popular a generation ago. But Tocher’s work remains of fundamental 
importance; it is a pity his example did not inspire Englishmen to extend his study of 
mental hospital populations. 

In 1925 (Biometrika, Vol. 17, pp. 142-58) M. Greenwood, C. M. Thompson & H. M. Woods 
published a memoir on heights and weights of patients and there are scattered through the 
literature other statistical papers, but not, I think, any large-scale investigation. But the 
mind-body relation still interests us all. Our remote predecessors thought there was 
a corporeal basis of the sanguine, melancholic, choleric and phlegmatic temperaments; 
psychologists and pathologists of to-day reject the bases imagined by our ancestors, but still 
search for a basis. Most people know that pulmonary tuberculosis takes a heavy toll in 
mental hospitals and perhaps dismiss the subject with some vague remark about the 
‘unfit’ or, alternatively, remember that the old physicians thought that grief or emotional 
shock was a factor predisposing to consumption. There seems to be no doubt that a particular 
form of mental disorder, dementia praecox, is particularly associated with tuberculosis, so 
intimately that one writer has maintained that dementia praecox and pulmonary tuberculosis 
can, in a sense, be regarded as modifications or expressions of a common defect or diathesis. 
All this is, however, the merest speculation without a biometric study. Obviously to take 
the heights and weights of patients in institutions would only be a beginning. Tocher did 
not classify his data under diagnosis, perhaps he could not have done so usefully because, 

* [During the summer of 1906, Tocher came south to Danby, in the Yorkshire moors, where K.P. was 
spending his vacation and the final arrangements for the publication of this memoir were no doubt then 
discussed. Tocher was also a friend of W. R. Macdonell whom he succeeded, at a rather later date, as 
part time lecturer in Statistics at the University of Aberdeen. Ed.] 





although 4436 males and 3961 females sound large numbers, when they are subdivided into 
age- and diagnosis-groups, the classes would grow small. The records of the mental hospitals 
of Great Britain in the forty years since Tocher did his work would suffice. But there has 
not been an English Tocher. 

In his later years, Tocher’s deserved reputation led to official or semi-official appeals for 
his help in work involving statistical analysis. He was very active in the study of milk 
production; in tackling such problems his chemical training was of great value. A matter of 
obvious importance is whether poor yield is due to nature or nurture or to both. Tocher was 
of opinion that both factors were involved but that ‘ the solution to the problem of deficient 
milk must be the careful selection of cows with good milk records and a good milking 
pedigree’ (see Stewart & Tocher, J. Dairy Res., January 1936). Tocher did not forget, as 
some academic writers did, the importance of vicious circles ; those who farm with insufficient 
capital cannot afford to buy pedigree stock and cannot afford to feed adequately what stock 
they have. He protested vigorously against the gross injustice of any legal presumption 
that because a particular sample of milk had a percentage of butter fat or a percentage of 
solids-not-fat below a prescribed minimum, the milk had been adulterated. It might, he 
thought, be quite reasonable to forbid the marketing of such milk, but certainly not right 
to affix a stigma to the farmer. Tocher was a good biometrician and a good practical 
Christian. 

In attempting to give a personal impression of Tocher, a writer who also knew two of his 
Aberdonian friends, Charles Creighton (1847-1927) and William Bulloch (1868-1941), both 
of whom left classical contributions to scientific literature, is tempted to compare them. 
Creighton was, in the old-fashioned sense of the word, the greatest scholar of the three. 
His History of Epidemics in Britain is a classic and other of his less-known writings would 
have been approved by Dr Johnson. Bulloch, less familiar than Creighton with the ancient 
writers, had an encyclopaedic knowledge of modern pathological literature and readers of 
this Journal are likely to remember his contribution to the Treasury of Human Inheritance, 
his study of Haemophilia which, incidentally, contains many illustrations of the irony which 
made Bulloch a famous raconteur. I do not think Tocher was the equal of these two friends 
in scholarship or literary power, but — although to some who perhaps over-value formal 
scientific training it may sound paradoxical — I should say he was a better, a more original, 
scientific investigator than either. Epidemiologists will never forget Creighton, bacterio- 
logists will never forget Bulloch. Biometricians, I hope, will read Tocher’s papers and show 
their gratitude in the way he would most have appreciated, that is by completing some of 
the tasks to which he put his hand. 

Tocher had always been a good friend of this Journal and he became a Trustee on the 
inception of the Biometrika Trust in 1936. 

MAJOR GREENWOOD 

BIBLIOGRAPHY OF TOCHER’S PAPERS 
1889. The springs and wells of Peterhead. Trans. Buchan Fld Cl. 1. 

1891. Test for the detection of sesame oil in olive oil and isolation of another substance (sesamin) 
from sesame oil. Pharm. J. 21 (24 Jan.). 

1893. A further note on sesamin. Pharm. J. 23 (25 Feb.). 

1894. The production of gas from paraffin oils, and from pure members of paraffin and terpene series. 
J. Soc. chem. Ind., Lond., 13 (31 Mar. and 31 May). 

1895. (With John Gray.) The ethnology of Buchan. Preliminary ethnographical observations. Trans. 
Buchan Fld Cl. 3. 




1897a. The ethnology of Buchan. Part II. Ethnographical survey of school-children in Buchan. 
Trans. Buchan Fld Cl. 4. 

1897b. Report on scales of contributions and systems of levies. British Order of Ancient Free Gardeners. 

1900a. (With John Gray.) The physical characteristics of adults and school-children in East Aberdeenshire. 
J. R. anthrop. Inst. 30 (= N.S. 3). 

1900b. (With John Gray.) The physical characteristics of the population of West Aberdeenshire. 
J. R. anthrop. Inst. (Reviews and Miscellanea), 30, no. 84 (= N.S. 3). 

1900c. The volumetric determination of red lead. Pharm. J. 64 (24 Mar.). 

1900d. Address on the Free Gardeners and the Friendly Society Movement. 

1901a. (With John Gray.) The ethnology of Buchan. Part III. The physical characteristics of adults 
and school-children in East Aberdeenshire. Trans. Buchan Fld Cl. 6. (Contains the same data 
as 1900a, with some additions.) 

1901b. The volumetric determination of phenol. Pharm. J. 66 (23 Mar.). 

1902a. (With John Gray.) The ethnology of Buchan. Part IV. The frequency and pigmentation value 
of surnames in East Aberdeenshire. Trans. Buchan Fld Cl. 7. 

1902b. (With John Gray.) The physical characteristics of the Eskimo of Southampton Island, etc. 
Trans. Buchan Fld Cl. 7. 

1902c. Scottish universities and pharmaceutical education. Pharm. J. 68 (1 Mar.). 

1902d. The oxidation and determination of uric acid and urates. Pharm. J. 69 (10 Aug.). 

1903a. The characteristics of mankind. Aberdeen Free Press (21 Mar.). 

1903b. A scheme of representation by territorial districts on the Pharmaceutical Council. Pharm. J. 71 
(15 and 22 Aug.). 

1905a. Medical inspection of schools. Address to the 31st Annual Congress of the Incorporated 
Sanitary Assoc. of Scotland. Inverness, Sept. 1905. 

1905b. Anthropometric survey of the inmates of asylums in Scotland. Henderson Tr. Rep. 1. 

1906a. The activity of pepsin after brief contact with certain inorganic compounds. Pharm. J. 77 
(28 July). 

1906b. The detection of citrates and tartrates. Pharm. J. 77 (28 July). 

1907a. The anthropometric characteristics of the inmates of asylums in Scotland. Biometrika, 5, 
298-350 and Suppl. (Supplement is a reprint of 1905b above.) 

1907b. Anthropometry, its aims and methods. Privately printed. 

1908. Pigmentation survey of school-children in Scotland. Biometrika, 6, 129-235 and Suppl. 

1909. Some problems of interest to pharmacists to-day. Year Book of Pharmacy and Trans. of the 
British Pharmaceutical Conference. 

1910. The necessity for a national eugenic survey. Eugen. Rev. 2. 

1913. The criteria for purity of water, chemical and bacteriological. Proc. Aberdeen Ass. Civil Engrs, 
13 (1912-13). 

1915. (With Karl Pearson.) On criteria for the existence of differential death-rates. Biometrika, 11, 
159-84. 

1918. Food values and costs in relation to milk production. Scot. J. Agric. 1, no. 1. 

1919a. Investigation into the milk yield of Ayrshire cows. Trans. Highl. agric. Soc. Scot. 

1919b. Variations in the composition of milk. Scot. J. Agric. 2, no. 3. 

1919c. Science and the struggle for existence. The Rotary Wheel. 

1919d. Report on distillery by-products. Privately printed. 

1921. On the need for consolidating the law relating to the sale of food and drugs. Trans. inc. sanit. 
Ass. Scot. 

1922a. The relationship between citric solubility of phosphates and yield of turnip crop. Trans. Highl. 
agric. Soc. Scot. 

1922b. The citric solubility of mineral phosphates. J. agric. Sci. 12, pt. 2. 

1923a. Grass sickness in horses. (Reports by J. F. Tocher, J. W. Tocher, W. Brown and J. B. Buxton.) 
Vet. Rec. 35. 

1923b. Milk yields and associated factors. Proc. World Dairy Congr. 

1924a. Grass sickness in horses. Trans. Highl. agric. Soc. Scot. (Contains same data as 1923a above.) 

1924b. Anthropometric observations on samples of the civil population of Aberdeenshire, etc. 
Henderson Tr. Rep. 2. 

1924c. A study of the chief physical characters of soldiers of Scottish nationality and a comparison 
with the physical characters of the insane population of Scotland. Henderson Tr. Rep. 3. 

1925a. Variations in the composition of milk. Edin. Med. J. (May). 

1925b. Variations in the composition of milk. H.M.S.O. 

1926a. The pharmacist, his training and vocation. Pharm. J. 117 (9 Oct.). 




1926b. Errors of judgment in chemical analysis. Analyst (July). 

1926c. Variations in the composition of milk. Analyst (Dec.). 

1927a. Causes of variation in the proportion of butter fat in milk. Scot. J. Agric. 10, no. 1. 

1927b. Variation in the proportion of solids-not-fat in milk. Scot. J. Agric. 10, no. 2. 

1927c. Sugar in milk. Scot. J. Agric. 10, no. 4. 

1928. An investigation of the milk yield of dairy cows. Biometrika, 20 B, 105-244. 

1929. The sale of milk. Suggested amendments of the law. Scot. J. Agric. 12, no. 4. 

1930. The study of literature. Lecture to Peterhead Literary Soc. 

1931a. Should the law relating to the sale of milk be amended? Trans. Yorks. agric. Soc. 

1931b. What is probable error? Lecture to Institute of Chemistry of Great Britain and Ireland. 

1932. A New Year’s Message. Chem. & Drugg. (2 Jan.). 

1934a. Raw versus pasteurized milk. Scot. J. Agric. 17, no. 1. 

1934b. The services of Francis Galton and his school to physical anthropology and eugenics. Rep. 
Brit. Ass., Section H. 

1935. The proportions of certain poisonous substances in feeding stuffs and their effect on livestock. 
Vet. Rec. 47. 

1936. (With Alice Stewart.) The effect of variations in feeding on dairy cows yielding milk of poor 
quality. J. Dairy Res. 7, no. 1. 

1937. Agricultural education and research in Scotland. Aberdeen Univ. Rev. (Mar.). 

1939a. Pasteurization of milk. Trans. Yorks. agric. Soc. no. 96. 

1939b. Mortality among livestock due to poisons. Vet. Rec. 51. 

1940. The role of science. The Second Book of Buchan. 

1941a. Prehistory and early history. The Second Book of Buchan. 

1941b. The bracken problem. Trans. Highl. agric. Soc. Scot. 



ON SOME MODES OF POPULATION GROWTH LEADING TO 
R. A. FISHER’S LOGARITHMIC SERIES DISTRIBUTION 

By DAVID G. KENDALL, M.A. 

Magdalen College, Oxford 


1. R. A. Fisher (1943) in a co-operative study written with A. S. Corbet and C. B. Williams 
has developed a mathematical theory which describes with some success the relative 
numbers of animals of different species obtained when sampling at random from a heterogeneous 
population. This problem was first considered in relation to (i) Corbet’s work on 
the distribution of butterflies in the Malay Peninsula, and (ii) the numbers of moths of 
different species caught in a light-trap over a given period of time (Williams’s data). Fisher 
began by assuming that for a particular species the number of individuals caught in time $t$ 
would be distributed as a Poisson variable of expectation $\omega t$, where $\omega$ may be called the 
intrinsic abundance of the species. He suggested that $\omega$ might be distributed in the Eulerian 
(or $\chi^2$) form 

$$\frac{1}{\Gamma(k)}\left(\frac{k}{\Omega}\right)^{k} e^{-k\omega/\Omega}\,\omega^{k-1}\,d\omega \qquad (0 < \omega < \infty), \tag{1}$$

where $\Omega$ is the mean value of $\omega$ and $k$ is a constant parameter, and showed that the actual 
number caught would then follow a negative binomial distribution with index $k$*. In fitting 
such a distribution to Corbet’s data he obtained very small values of $k$, and this suggested 
that it might be worth while examining what would happen if $\Omega$ and $k$ were allowed to tend 
to zero in a constant ratio. In this way Fisher found that if a species were known to have been 
caught, it would be represented in the catch by exactly $n$ individuals with a probability 

$$\frac{x^{n}}{ny} \qquad (n = 1, 2, 3, \ldots), \tag{2}$$

where† 

$$x = \frac{\alpha t}{1 + \alpha t}, \qquad y = -\ln(1 - x), \qquad \text{and} \qquad \alpha = \lim \Omega/k.$$
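
The limiting step that leads to (2) can be checked numerically. The sketch below is an illustration only and is not taken from Fisher's paper: it assumes the usual form of the negative binomial obtained by mixing a Poisson variable over the Eulerian form (1), conditions it on at least one individual having been caught, and compares the result with the logarithmic series for a very small index k. The function names and the values of alpha and t are hypothetical.

```python
import math

def log_series_pmf(n, x):
    """Logarithmic series term x^n / (n y), with y = -ln(1 - x)."""
    y = -math.log1p(-x)
    return x**n / (n * y)

def truncated_negbin_pmf(n, k, x):
    """Negative binomial with index k, conditioned on n >= 1.

    Unconditional term: Gamma(n + k) / (n! Gamma(k)) * (1 - x)^k * x^n.
    """
    log_term = (math.lgamma(n + k) - math.lgamma(n + 1) - math.lgamma(k)
                + k * math.log1p(-x) + n * math.log(x))
    prob_at_least_one = -math.expm1(k * math.log1p(-x))  # 1 - (1 - x)^k
    return math.exp(log_term) / prob_at_least_one

alpha, t = 2.0, 1.5                 # illustrative values only
x = alpha * t / (1.0 + alpha * t)   # unchanged as k -> 0 with Omega/k = alpha

# Largest discrepancy over the first 30 terms for a very small index k.
max_gap = max(abs(truncated_negbin_pmf(n, 1e-6, x) - log_series_pmf(n, x))
              for n in range(1, 31))
print(max_gap)
```

Under these assumptions the discrepancy shrinks in proportion to k, which is the sense in which the logarithmic series is the limit of the truncated negative binomial.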


The success of this ‘logarithmic series distribution’ in graduating the entomological data 
of Corbet, Williams and others implies that in the populations concerned the distribution of 
intrinsic abundance must be (for $\omega$ not too small) effectively of the form 

$$A\,e^{-\omega/\alpha}\,d\omega/\omega \qquad (A = \text{constant}). \tag{3}$$

(This cannot of course be true for all $\omega$, for then the integral of total probability would not 
converge.) It will be noticed that the distribution (3) of intrinsic abundance is itself the 
continuous analogue of the logarithmic series. The success of (2) in describing the relative 
numbers of individuals caught is thus a challenge to biologists to provide a theoretical 
interpretation for (3). 

In this connexion it is worth noting that if one is concerned with a population containing 
only a finite number ($Z$, say) of species, then the continuous distribution (3) can be replaced 
by a logarithmic series, and results similar to those of Fisher follow as before. Thus, suppose 
that the actual number $\nu$ of individuals by which a particular species is represented in the 
whole population is distributed in the discrete form 

$$\frac{X^{\nu}}{\nu Y} \qquad (\nu = 1, 2, 3, \ldots), \tag{4}$$

* This step in the argument is of course equivalent to that taken by M. Greenwood & G. U. 
Yule (1920) in another context. 
† I write ln z for the natural logarithm of z. 





where $Y = -\ln(1 - X)$, and let $p = 1 - e^{-\gamma t}$ be the chance that an individual will be caught 
in an exposure of duration $t$. Then the chance that the species will have $n = 0, 1, 2, 3, \ldots$ 
representatives in the sample is given by* 

$$P_0 = 1 - \frac{y}{Y} \qquad \text{and} \qquad P_n = (1 - P_0)\,\frac{x^{n}}{ny} \qquad (n > 0), \tag{5}$$

i.e. a logarithmic distribution with a zero term added. Here 

$$x = 1 - e^{-y} \qquad \text{and} \qquad e^{y} - 1 = (e^{Y} - 1)(1 - e^{-\gamma t}), \tag{6}$$

and so $y < Y$ for all $t$; for very small exposures $p$ will be small and then 

$$y \sim \gamma t\,(e^{Y} - 1),$$

while for very long exposures $p$ will be nearly equal to unity and then 

$$y \sim Y.$$

The expected values of $S$ and $N$ (the number of species and the number of individuals in 
the catch) are $S = Z(1 - P_0) = yZ/Y$ and $N = (e^{y} - 1)Z/Y$. 

Thus as $t \to \infty$, $S \to Z$ and $N \to (e^{Y} - 1)Z/Y$, while for all values of $t$ 

$$S = \alpha \ln(1 + N/\alpha), \tag{7}$$

where $\alpha = Z/Y$ is a constant independent of the time of exposure and corresponding to 
Fisher’s ‘index of diversity’. Formula (7) is, in fact, identical with the well-known result 
due to Fisher (1943), although the derivation given here proceeds from somewhat different 
assumptions. 
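
The sampling property (5) and the relation (7) lend themselves to a direct numerical check. The following sketch is an illustration under stated assumptions rather than part of the argument: it assumes that each individual of a species is caught independently with chance p, sums the resulting binomial terms over the population distribution (4) up to a large cut-off, and compares the sum with the closed form (5), using (6) for x and y. The function names, the cut-off nu_max and the values of X, p and Z are hypothetical.

```python
import math

def thinned_log_series(n, X, p, nu_max=300):
    """P(n caught) when nu follows (4) and each individual is caught with chance p."""
    Y = -math.log1p(-X)
    total = 0.0
    for nu in range(max(n, 1), nu_max + 1):
        total += (X**nu / (nu * Y)) * math.comb(nu, n) * p**n * (1.0 - p)**(nu - n)
    return total

def formula_5(n, X, p):
    """Closed form (5), with y and x obtained from (6)."""
    Y = -math.log1p(-X)
    y = math.log1p(math.expm1(Y) * p)   # e^y - 1 = (e^Y - 1) p
    x = -math.expm1(-y)                 # x = 1 - e^{-y}
    p0 = 1.0 - y / Y
    return p0 if n == 0 else (1.0 - p0) * x**n / (n * y)

X, p, Z = 0.8, 0.3, 500                 # illustrative values only
Y = -math.log1p(-X)
y = math.log1p(math.expm1(Y) * p)
alpha = Z / Y
S = y * Z / Y                           # expected number of species in the catch
N = math.expm1(y) * Z / Y               # expected number of individuals in the catch

worst = max(abs(thinned_log_series(n, X, p) - formula_5(n, X, p)) for n in range(8))
print(worst, S, alpha * math.log1p(N / alpha))
```

With these values the two ways of computing the sample distribution agree to rounding error, and the last two printed numbers coincide, as (7) requires.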

Williams (1944 a, b) has shown that the logarithmic series (2) can also be applied to a great 
variety of other biological problems, in which the integer n is variously the number of species 
per genus, the number of genera per subfamily, the number of parasites per host, and even 
the number of research papers per biologist (published in a particular year). It is hard to 
believe that a single mechanism will be found to explain the relevance of the logarithmic 
series to all these problems, and it seems therefore well worth while to record any theoretical 
models which may be found to lead mathematically to this distribution. In the remainder 
of this paper I shall describe a number of discontinuous Markoff processes which lead to 
distributions of negative binomial and logarithmic series form, in the hope that some of 
these may be found to be of biological significance.


2. The stochastic processes to be considered here will for convenience be described in 
relation to the growth of a hypothetical population of organisms, whose numbers fluctuate 
with the incidence of mortality, reproduction (by binary fission) and immigration from the 
outside world. Let n be the size of the population at time t; then n(t) is a random function 
which develops in the following manner: 

(i) If $n > 0$, the only possible transitions in an element of time $dt$ are from $n$ to $n - 1$, $n$ or
$n + 1$, and the transition probabilities are

$$n \to \begin{cases} n + 1 & \text{with probability } (n\beta + \kappa)\,dt, \\ n & \text{with probability } 1 - (n\mu + n\beta + \kappa)\,dt, \\ n - 1 & \text{with probability } n\mu\,dt. \end{cases}$$


* It appears that this sampling property of the logarithmic series distribution (which is easily proved
with the aid of the generating function) has already been noticed by C. B. Williams (1947) and
M. H. Quenouille.



8 Modes of Population Growth 

(ii) (This is actually included in (i), but an explicit statement is desirable.) If $n = 0$, the
only possible transitions in time $dt$ are from 0 to 0 or 1, the transition probabilities being

$$0 \to \begin{cases} 1 & \text{with probability } \kappa\,dt, \\ 0 & \text{with probability } 1 - \kappa\,dt. \end{cases}$$

Let $P_n(t)$ be the probability that at time $t$ the population size is $n$; then it is possible to set
up an infinite system of differential-difference equations which together with the distribution
$\{P_n(t_0)\}$ at some initial time $t = t_0$ determine the $\{P_n(t)\}$ at all subsequent times, and so govern
the mode of growth of the population. Two alternative sets of initial conditions will be
considered here:


(A) $P_0(-T) = 1$ and $P_n(-T) = 0$ $(n > 0)$.

This implies that at time $t = -T$ the population size was zero.

(B) $P_0(-T) = 0$, $P_1(-T) = 1$ and $P_n(-T) = 0$ $(n > 1)$.

This implies that the population commenced with one individual at time $t = -T$.

Next it is necessary to give a biological interpretation of the effects associated with the
constants $\beta$, $\mu$ and $\kappa$.

The first of these, $\beta$, represents the reproductive power of the individuals composing the
population, the effects of sex and age being ignored. Thus it is supposed that if attention is
focused on any one individual at time $t$, it will be found to undergo binary fission at a time
$t + \tau$, where $\tau$ has the probability distribution

$$e^{-\beta\tau}\beta\,d\tau \quad (0 < \tau < \infty). \tag{8}$$


An important consequence of the assumption (8) is that the time to the next subdivision,
for any individual, is statistically independent of its past history, and in particular it is
independent of the length of time since that individual was itself formed by the fission of
its parent. At first sight it might appear that a bacterial colony would provide a good
example of such a population growing by binary fission, but it must be remembered that the
generation times of bacteria, while liable to considerable random variation, have a frequency
distribution* very different from (8) and possessing a pronounced non-zero mode.

The $n$ individuals present at any time are assumed to reproduce themselves independently
of one another, and at the same constant mean rate. At each subdivision the parent can
be thought of either as being replaced by its two offspring, or as only adding one new
member to the colony and remaining a member itself; a transition $n \to n + 1$ then takes
place.

In a similar way the constant $\mu$ represents the loss to the colony due to 'mortality'. It is
assumed that an individual does not lose its power to reproduce unless it 'dies', and that it
then ceases to be regarded as a member of the colony, so that a transition $n \to n - 1$ takes
place. Such a transition could, however, also mean the removal (by any means) of an
individual from the region considered; these two sources of loss are mathematically
indistinguishable and will therefore be covered by the same symbol $\mu$. Thus if an individual is
observed at time $t$, it will disappear from the population at a time $t + \tau$, where $\tau$ has the
distribution

$$e^{-\mu\tau}\mu\,d\tau \quad (0 < \tau < \infty).$$


* See, for example, Kelly & Rahn (1932) and Hinshelwood (1946).





The $\beta$- and $\mu$-effects, described separately, are to be thought of as acting simultaneously
and independently one of the other. Thus, when the $\beta$- and $\mu$-effects are acting together, the
chance that an individual remains inactive for a time $\tau$ and then subdivides during the
subsequent time interval $d\tau$ is

$$e^{-(\beta+\mu)\tau}\beta\,d\tau.$$

Integration from $\tau = 0$ to $\tau = \infty$ then gives the chance that the individual will subdivide
before the $\mu$-effect has removed it from the colony; this is

$$\int_0^\infty e^{-(\beta+\mu)\tau}\beta\,d\tau = \frac{\beta}{\beta+\mu}.$$


Finally the $\kappa$-effect is one of 'immigration from outside', i.e. it is supposed that from time
to time individuals not initially members of the colony may join it and proceed to behave
exactly like the other members. If, from a given time instant $t$, the next such 'immigration'
occurs at time $t + \tau$, the distribution of $\tau$ is assumed to be

$$e^{-\kappa\tau}\kappa\,d\tau \quad (0 < \tau < \infty).$$

The structure of the model will now be clear. It only remains to point out that the
probabilities of a positive unit increment from the $\beta$-effect or a negative unit increment from the
$\mu$-effect in an element of time $dt$ will each be proportional to $n$, the existing population size,
while the chance of a unit positive increment from the $\kappa$-effect is the same for all values of $n$.
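
The model just described can be simulated directly. The sketch below is an illustration and not part of the original paper: it assumes the standard event-by-event simulation of a continuous-time Markoff process (exponential waiting time at the total rate, then a choice of event in proportion to the individual rates). The function name `simulate` and all parameter values are hypothetical.

```python
import random

def simulate(beta, mu, kappa, T, n0=0, rng=random):
    """Return the population size after time T, starting from n0 individuals."""
    t, n = 0.0, n0
    while True:
        total = n * (beta + mu) + kappa       # total rate of leaving state n
        if total == 0.0:
            return n                          # nothing can happen any more
        t += rng.expovariate(total)           # waiting time to the next event
        if t > T:
            return n
        if rng.random() * total < n * mu:
            n -= 1                            # death or removal (mu-effect)
        else:
            n += 1                            # fission (beta) or immigration (kappa)

if __name__ == "__main__":
    rng = random.Random(1)
    sizes = [simulate(0.5, 1.0, 0.8, 2.0, rng=rng) for _ in range(4000)]
    print(sum(sizes) / len(sizes))
```

Births and immigrations are pooled because both raise n by one; only their combined rate nβ + κ matters for the size of the population.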


3. The differential-difference equations of the process can now be written down. They are

$$\frac{\partial}{\partial t}P_n(t) = (n+1)\mu P_{n+1}(t) - \{n(\beta+\mu)+\kappa\}P_n(t) + \{(n-1)\beta+\kappa\}P_{n-1}(t), \tag{9}$$

if $n \ge 1$, and

$$\frac{\partial}{\partial t}P_0(t) = \mu P_1(t) - \kappa P_0(t). \tag{10}$$

It is convenient to define $P_n(t)$ as being identically equal to zero when $n < 0$; equation (10) can
then be included within the general form (9). I owe to Dr M. S. Bartlett the remark that
systems of equations of this type can most conveniently be solved with the aid of the
generating function

$$\phi(z,t) = \sum_{n=-\infty}^{\infty} z^n P_n(t). \tag{11}$$

It will be seen from (9) that $\phi(z,t)$ must satisfy the partial differential equation

$$\frac{\partial\phi}{\partial t} = \{\mu - (\beta+\mu)z + \beta z^2\}\frac{\partial\phi}{\partial z} + \kappa(z-1)\phi, \tag{12}$$

which together with one of the boundary conditions,

(A) $\phi(z,-T) = 1$,

or

(B) $\phi(z,-T) = z$,

and the requirement that the expansion of $\phi$ must contain no terms in $1/z$, $1/z^2$, ..., is sufficient
completely to determine the process.

The differential equation (12) is of the standard Lagrangian form, the auxiliary equations
being

$$-dt = \frac{dz}{(\beta z-\mu)(z-1)} = -\frac{d\phi}{\kappa(z-1)\phi}. \tag{13}$$




'First integrals' are

$$(\beta-\mu)t + \ln(z-1) - \ln(\beta z-\mu) = \text{constant},$$

and

$$\kappa\ln(\beta z-\mu) + \beta\ln\phi = \text{constant},$$

if $\kappa > 0$ and $\beta \ne \mu$, and so the general integral of (12) is then

$$\phi(z,t) = (\mu-\beta z)^{-\kappa/\beta}\,\Phi\!\left\{\frac{1-z}{\mu-\beta z}\,e^{(\beta-\mu)t}\right\}, \tag{14}$$

where $\Phi$ is an arbitrary function to be determined from the boundary conditions.

With boundary condition (A) it will be found that

$$\phi(z,0) = \left\{\frac{\beta-\mu}{\beta\Lambda-\mu-\beta(\Lambda-1)z}\right\}^{\kappa/\beta}, \tag{15}$$

where $\Lambda$ has been written for $e^{(\beta-\mu)T}$ (this is equal to the expected factor by which the
population will be multiplied in a time interval $T$, when it is growing in the absence of immigration).
Similarly with boundary condition (B) one obtains

$$\phi(z,0) = (\beta-\mu)^{\kappa/\beta}\,\frac{\mu(\Lambda-1)-(\mu\Lambda-\beta)z}{\{\beta\Lambda-\mu-\beta(\Lambda-1)z\}^{1+\kappa/\beta}}, \tag{16}$$

where $\Lambda$ has the same meaning as before.

When $\kappa = 0$, the solutions are of a slightly different form. The general integral of (12)
is then

$$\phi(z,t) = \Phi\!\left\{\frac{1-z}{\mu-\beta z}\,e^{(\beta-\mu)t}\right\}. \tag{17}$$

Condition (A), as might be expected, gives $\phi(z,t) = 1$; this merely asserts that a zero
population will remain zero if there is no immigration. Condition (B), however, gives

$$\phi(z,0) = \frac{\mu(\Lambda-1)-(\mu\Lambda-\beta)z}{\beta\Lambda-\mu-\beta(\Lambda-1)z}. \tag{18}$$

It will be noticed that in every case the solution is a regular function of $z$ near $z = 0$, so
that in the Laurent expansion the coefficients of the negative powers will all vanish, as
required.


4. It is now a simple matter to interpret these solutions. There are three cases of special 
interest. 

(i) Consider first a population growing from zero, so that (A) is the appropriate boundary 
condition, the population being established in the first instance by immigration from 
outside. The $\kappa$-effect is of course acting all the time, even after the colony has started growing,
so that in general there will at any time be present a number of independent families, each
descended from a different immigrant ancestor. According to (15) the population size $n$,
after the process has been developing for a time $T$, will be distributed as a negative binomial
variate with index $\kappa/\beta$ and mean value

$$E(n) = \frac{\kappa(\Lambda-1)}{\beta-\mu}. \tag{19}$$

Thus if $\beta > \mu$ the expected size of the population will for large $T$ grow geometrically at an
exponential rate $\beta-\mu$.

If, however, $\beta < \mu$, so that the force of mortality more than compensates for the force of
reproduction, one will have

$$\lim_{T\to\infty} E(n) = \frac{\kappa}{\mu-\beta}. \tag{20}$$





This is the mean of the stable distribution of population size which can just be maintained
by the immigration rate $\kappa$. If $\kappa$ were equal to zero the population would almost certainly die
out in a finite time.
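
The negative binomial result for a population growing from zero can be checked by integrating the differential-difference equations numerically. The sketch below is an illustration under stated assumptions: the infinite system is truncated at a hypothetical upper size `top`, a classical fourth-order Runge-Kutta step is used, and the negative binomial is taken to have index κ/β and generating function {(1 − u)/(1 − uz)}^{κ/β} with u = β(Λ − 1)/(βΛ − μ). The parameter values are arbitrary.

```python
import math

def master_rhs(P, beta, mu, kappa):
    """Right-hand side of the differential-difference equations, truncated."""
    top = len(P) - 1
    dP = []
    for n in range(top + 1):
        from_above = (n + 1) * mu * P[n + 1] if n < top else 0.0
        from_below = ((n - 1) * beta + kappa) * P[n - 1] if n > 0 else 0.0
        dP.append(from_above - (n * (beta + mu) + kappa) * P[n] + from_below)
    return dP

def integrate_from_zero(beta, mu, kappa, T, top=60, steps=1000):
    """Runge-Kutta integration over a time T, starting from an empty population."""
    P = [1.0] + [0.0] * top
    h = T / steps
    for _ in range(steps):
        k1 = master_rhs(P, beta, mu, kappa)
        k2 = master_rhs([p + 0.5 * h * k for p, k in zip(P, k1)], beta, mu, kappa)
        k3 = master_rhs([p + 0.5 * h * k for p, k in zip(P, k2)], beta, mu, kappa)
        k4 = master_rhs([p + h * k for p, k in zip(P, k3)], beta, mu, kappa)
        P = [p + h * (a + 2 * b + 2 * c + d) / 6.0
             for p, a, b, c, d in zip(P, k1, k2, k3, k4)]
    return P

def negative_binomial(beta, mu, kappa, T, top=60):
    """Negative binomial probabilities with index kappa/beta (beta != mu)."""
    lam = math.exp((beta - mu) * T)
    u = beta * (lam - 1.0) / (beta * lam - mu)
    k = kappa / beta
    pmf = [(1.0 - u) ** k]
    for n in range(top):
        pmf.append(pmf[-1] * (k + n) / (n + 1) * u)
    return pmf

if __name__ == "__main__":
    numeric = integrate_from_zero(0.5, 1.0, 0.8, 2.0)
    closed = negative_binomial(0.5, 1.0, 0.8, 2.0)
    print(max(abs(a - b) for a, b in zip(numeric, closed)))
```

The truncation is harmless here because the probability of sizes near `top` is negligible for the chosen rates; larger β − μ or longer T would need a larger `top` and a smaller step.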

(ii) Now suppose that $\kappa$ is very small, though still just greater than zero; to be more precise
suppose that the ratio $\kappa/\beta$ is negligible, so that while immigration is sufficient to start off
the process, and to restart it if ever the population is wiped out by an excess of the $\mu$-effect,
it is negligible when compared with the contributions from the $\beta$-effect while the colony is
actually growing. Then, exactly as in Fisher's analysis referred to in § 1, it will be seen that
the size distribution of such a colony observed at any time will be a logarithmic series: in
fact the distribution of $n$, given that $n > 0$, is*

$$\frac{x^n}{ny} \quad (n = 1, 2, 3, \ldots),$$

where as always $y = -\ln(1-x)$, and

$$x = \frac{\beta(\Lambda-1)}{\beta\Lambda-\mu}. \tag{21}$$

If $T$ is very large, then

$$1 - x \sim \left(1-\frac{\mu}{\beta}\right)\frac{1}{\Lambda} \quad \text{when } \beta > \mu, \tag{22}$$

and

$$x \sim \frac{\beta}{\mu} \quad \text{when } \beta < \mu. \tag{23}$$

The second case is the more interesting, for one can then let $T$ tend to infinity and so obtain
the stable distribution of population size when $\beta < \mu$. Of course in the limit when $\kappa = 0$
it is 'almost certain' that $n = 0$. If, however, a colony does exist (i.e. if $n > 0$), then it is
almost certainly homogeneous (descended from a single immigrant ancestor), and its size
$n$ will be distributed in a logarithmic series.
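
The passage from the negative binomial to the logarithmic series can be illustrated numerically. This is a minimal sketch with hypothetical function names: it conditions a negative binomial with a very small index k (standing for κ/β) on n > 0 and compares the result with the series x^n/(ny), taking the negative binomial to have generating function {(1 − x)/(1 − xz)}^k.

```python
import math

def conditional_negative_binomial(x, k, n_max):
    """P(n | n > 0) for n = 1..n_max, negative binomial with index k."""
    log_p0 = k * math.log1p(-x)               # log of the zero term (1 - x)^k
    one_minus_p0 = -math.expm1(log_p0)        # 1 - P_0 without cancellation error
    term, probs = math.exp(log_p0), []
    for n in range(1, n_max + 1):
        term *= (k + n - 1) / n * x           # ratio of successive terms
        probs.append(term / one_minus_p0)
    return probs

def logarithmic_series(x, n_max):
    """Fisher's series x^n / (n y), with y = -ln(1 - x)."""
    y = -math.log1p(-x)
    return [x ** n / (n * y) for n in range(1, n_max + 1)]

if __name__ == "__main__":
    small_index = conditional_negative_binomial(0.6, 1e-6, 10)
    limit = logarithmic_series(0.6, 10)
    for a, b in zip(small_index, limit):
        print(round(a, 6), round(b, 6))
```

The `expm1` and `log1p` calls are used because 1 − P_0 is of order k, so a direct subtraction would lose most of its significant figures.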

(iii) The case $\beta = \mu$, when reproduction and mortality just balance, requires a separate
discussion. The equations auxiliary to (12) are now

$$-dt = \frac{dz}{\beta(z-1)^2} = -\frac{d\phi}{\kappa(z-1)\phi},$$

and the general integral is

$$\phi(z,t) = (1-z)^{-\kappa/\beta}\,\Phi\!\left\{\beta t + \frac{1}{1-z}\right\}.$$

Condition (A) then gives

$$\phi(z,0) = (1+\beta T)^{-\kappa/\beta}\left\{1-\frac{\beta Tz}{1+\beta T}\right\}^{-\kappa/\beta}, \tag{24}$$

so that here again $n$ is distributed as a negative binomial variate with index $\kappa/\beta$, but the
mean value is now

$$E(n) = \kappa T, \tag{25}$$

so that the expected size of the population is linearly proportional to the time of exposure
to immigration. If $\kappa/\beta$ is very small the limiting conditional distribution of $n$ (given that
$n > 0$) is once again the Fisher series

$$\frac{x^n}{ny} \quad (n = 1, 2, 3, \ldots),$$

where now

$$x = \frac{\beta T}{1+\beta T}. \tag{26}$$


* This conditional distribution is of course obtained by taking the ratio of the general term to the
sum of all the terms but the first, in the negative binomial series, and then letting $\kappa/\beta \to 0$.





5. It will perhaps have been noticed that in the analysis of the last section the
immigration rate $\kappa$ (when $\kappa/\beta$ is small) merely has the effect of ensuring that an observed population
is almost certainly descended from a single immigrant ancestor who entered the region
considered at some instant during the time interval of length $T$ preceding the moment of
observation. I think this gives us a clue to the true 'explanation' of the occurrence of the
logarithmic series in the solution to the problem just considered. Before exploring the matter
further it will be found helpful to examine the growth of a single family by setting $\kappa = 0$ and
starting from unit population at time $t = -T$, thus employing boundary condition (B).

From the generating function (18) it then follows that the distribution of the population
size $n$ at the time of observation ($t = 0$) is*

$$P_0 = \frac{\mu(\Lambda-1)}{\beta\Lambda-\mu} \quad\text{and}\quad P_n = (1-P_0)(1-u)u^{n-1} \quad (n > 0), \tag{27}$$

where

$$u = \frac{\beta(\Lambda-1)}{\beta\Lambda-\mu}.$$

The distribution is thus a geometric series with a modified zero term, the mean population
size being

$$E(n) = \Lambda,$$

so that the expected population grows geometrically at an exponential rate $\beta-\mu$ (which will,
of course, be negative if $\beta < \mu$).

It is of interest to evaluate the variance of the population size. This proves to be

$$\mathrm{Var}(n) = E(n-\Lambda)^2 = \frac{\beta+\mu}{\beta-\mu}\,\Lambda(\Lambda-1), \tag{28}$$

and the coefficient of variation of the population size is thus

$$\text{C. of V.}(n) = \sqrt{\frac{\beta+\mu}{\beta-\mu}\left(1-\frac{1}{\Lambda}\right)}.$$

For large $T$ these expressions become

$$\text{C. of V.}(n) \sim \sqrt{\frac{\beta+\mu}{\beta-\mu}} \quad \text{if } \beta > \mu, \tag{29}$$

and

$$\text{C. of V.}(n) \sim \sqrt{\frac{\mu+\beta}{(\mu-\beta)\Lambda}} \quad \text{if } \beta < \mu; \tag{30}$$

in the second case it will be recalled that $E(n) \to 0$ as $T \to \infty$.
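
The moments of the single-family distribution can be confirmed by direct summation. The sketch below assumes the geometric form with a modified zero term, P_0 = μ(Λ − 1)/(βΛ − μ) and P_n = (1 − P_0)(1 − u)u^{n−1} with u = β(Λ − 1)/(βΛ − μ); the rates, the time T and the function name are illustrative only.

```python
import math

def family_size_distribution(beta, mu, T, n_max):
    """Return (Lambda, [P_0, ..., P_n_max]) for a family started by one individual."""
    lam = math.exp((beta - mu) * T)
    denom = beta * lam - mu
    u = beta * (lam - 1.0) / denom
    p0 = mu * (lam - 1.0) / denom
    probs = [p0] + [(1.0 - p0) * (1.0 - u) * u ** (n - 1)
                    for n in range(1, n_max + 1)]
    return lam, probs

if __name__ == "__main__":
    beta, mu, T = 1.0, 0.6, 2.0
    lam, probs = family_size_distribution(beta, mu, T, 400)
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    print(mean, lam)                                            # mean against Lambda
    print(var, (beta + mu) / (beta - mu) * lam * (lam - 1.0))   # variance formula
```

Truncating the sum at 400 terms is safe for these values because u^400 is far below machine precision.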

The distribution of the population size thus behaves rather differently in the two cases of 
an exponentially growing and an exponentially decreasing population; this agrees with 
a conclusion reached by M. S. Bartlett (1937). He considered the similar problem when there 
is no spreading of generations and the generation-time is rigorously constant. As he points 
out, there is a connexion with Fisher’s theory of the extinction of rare characters and, one 
might add, with the work of Francis Galton, H. W. Watson, and A. J. Lotka on the extinction 
of surnames.†


* (Added in proof.) Dr Bartlett has pointed out to me that the result (27) is stated by N. Arley 
& V. Borchsenius in Acta Math. (1945), 76, 298-9. It is attributed by them to Dr C. Palm. 

† See Fisher (1930), Galton (1889) and Lotka (1931). The problem of the distribution of surnames,
and its variation in time, seems not yet to have received all the attention it deserves. The surname is
a 'rare character' whose extinction can very readily be observed; normal social conditions ensure that
it is inherited as if it were controlled by a gene totally sex-linked in Y. Reference may be made to the
work of R. A. Fisher & Janet Vaughan (1939), and J. A. Fraser Roberts (1941-2), who have considered
the relation between surnames and blood-groups. [See also reference to Tocher & Gray, p. 1
above. Ed.]





When $\beta = \mu$, so that the forces of reproduction and mortality just balance, the above
solution must be modified. The appropriate results are most easily obtained by letting $\beta$
tend to $\mu$ in the several formulae. In this way one finds

$$P_0 = \frac{\beta T}{1+\beta T} \quad\text{and}\quad P_n = (1-P_0)(1-u)u^{n-1} \quad (n > 0), \tag{31}$$

where

$$u = \frac{\beta T}{1+\beta T},$$

and so

$$E(n) = 1, \quad \mathrm{Var}(n) = 2\beta T \quad\text{and}\quad \text{C. of V.}(n) = \sqrt{2\beta T}. \tag{32}$$

It is of interest to note that if $\beta$ and $\mu$ instead of being constants are each proportional to
the same function of the time,* the above theory still holds, provided that $T$ is everywhere
replaced by

$$\int_{-T}^{0} \psi(t)\,dt,$$

where $\beta = \beta_0\psi(t)$ and $\mu = \mu_0\psi(t)$.

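6. Consider now the size distribution of a colony developing in the absence of immigration
and known to have originated from a single individual whose arrival in the region concerned
occurred during the preceding $T$ time-units. If the time of arrival of the common ancestor
is a random variable uniformly distributed from $t = -T$ to $t = 0$, it will follow from (27) that

$$P_n = \frac{1}{T}\int_0^T \frac{(\beta-\mu)^2\,\Lambda\,\beta^{n-1}(\Lambda-1)^{n-1}}{(\beta\Lambda-\mu)^{n+1}}\,d\tau \quad (n > 0),$$

where $\Lambda = e^{(\beta-\mu)\tau}$. Now since

$$\frac{d}{d\tau}\left\{\frac{\beta(\Lambda-1)}{\beta\Lambda-\mu}\right\} = \frac{\beta\Lambda(\beta-\mu)^2}{(\beta\Lambda-\mu)^2},$$

this can be written

$$P_n = \frac{1}{\beta T}\int_0^U u^{n-1}\,du = \frac{U^n}{n\beta T},$$

while

$$P_0 = 1 + \frac{1}{\beta T}\ln(1-U),$$

$U$ being the value of $\beta(\Lambda-1)/(\beta\Lambda-\mu)$ when $\tau = T$. Thus the distribution of the
population size at the time of observation ($t = 0$) is

$$P_0 = 1 - \frac{y}{\beta T} \quad\text{and}\quad P_n = (1-P_0)\,\frac{x^n}{ny} \quad (n > 0), \tag{33}$$

where

$$x = \frac{\beta(\Lambda-1)}{\beta\Lambda-\mu}, \tag{34}$$

$y$ and $\Lambda$ being defined as before. For large $T$,

$$x \sim 1 - \left(1-\frac{\mu}{\beta}\right)\frac{1}{\Lambda} \quad \text{if } \beta > \mu, \tag{35}$$

and

$$x \sim \frac{\beta}{\mu} \quad \text{if } \beta < \mu. \tag{36}$$

In the latter case it is permissible to let $T$ tend to infinity and so obtain the stable (logarithmic
series) distribution for $n$ (given that $n > 0$). When $\beta = \mu$ the $(x, T)$ relation is $x = \beta T/(1+\beta T)$.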
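
The averaging over a uniformly distributed arrival time can be verified by numerical quadrature. The sketch below is illustrative only: it assumes the single-family probabilities P_n = (1 − P_0)(1 − u)u^{n−1} for a family of age τ, integrates them over τ with Simpson's rule (a swapped-in numerical method, not the paper's analytic argument), and compares the result with the closed form U^n/(nβT). Function names and parameter values are hypothetical.

```python
import math

def p_n_given_age(n, beta, mu, tau):
    """P_n (n >= 1) for a family founded a time tau before observation."""
    lam = math.exp((beta - mu) * tau)
    denom = beta * lam - mu
    u = beta * (lam - 1.0) / denom
    return lam * (beta - mu) ** 2 / denom ** 2 * u ** (n - 1)

def p_n_uniform_arrival(n, beta, mu, T, intervals=2000):
    """Average of p_n_given_age over tau uniform on (0, T), by Simpson's rule."""
    h = T / intervals
    total = p_n_given_age(n, beta, mu, 0.0) + p_n_given_age(n, beta, mu, T)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * p_n_given_age(n, beta, mu, i * h)
    return total * h / (3.0 * T)

def p_n_closed_form(n, beta, mu, T):
    """U^n / (n beta T), with U the value of u at tau = T."""
    lam = math.exp((beta - mu) * T)
    U = beta * (lam - 1.0) / (beta * lam - mu)
    return U ** n / (n * beta * T)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, p_n_uniform_arrival(n, 0.7, 1.0, 3.0), p_n_closed_form(n, 0.7, 1.0, 3.0))
```

Dividing each closed-form value by their sum over n ≥ 1 gives x^n/(ny) with x = U, which is the logarithmic series conditional on n > 0.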

It will now be clear why, in § 4 (ii), the boundary condition (A) together with the hypothesis
$n > 0$ led to a logarithmic series distribution for $n$ as $\kappa/\beta \to 0$. For in these circumstances it

* A discussion of the similar problem when $\beta$ and $\mu$ are any (not necessarily the same) functions of
the time will be given in my paper (1948).




would be almost certain that an observed population was wholly descended from a single
immigrant ancestor who arrived in the field of observation at an unknown time instant
uniformly distributed between $t = -T$ and $t = 0$.*

7. The classical theory of population growth (see, for example, A. J. Lotka (1945) for 
a general review and extensive references) is largely based on a deterministic description of 
the phenomena, which leads to differential and integral equations for the expectation values 
of the random variables concerned (the total population size, and the numbers in the several 
age groups). Apart from the work on ‘extinction’ already mentioned, the first stochastic 
treatment of the general problems of population growth seems to be that of W. Feller in an 
important paper† which is unfortunately not generally accessible in this country. Feller's
work has been further developed by N. Arley (1943),‡ who showed that discontinuous Markoff
processes of a similar type are equally relevant in the theory of the 'cascade showers'
initiated by cosmic ray particles. In particular he makes use of the simple birth-and-death
process discussed here in § 5, and quotes Feller's formulae for the mean and variance of $n$ as
functions of the time $t$. Arley gives an elegant method for determining an expansion for
$P_n(t)$ in powers of $(t - t_0)$, and observes that the calculation of high order coefficients becomes
very cumbersome. One can now see, on examining the complete solution contained in (27),
that this is very reasonable, for $P_n(t)$ is quite a complicated function of the time $t$, although
a simple one of the reduced variable $u$.

The results of the present paper would not have been obtained without the
generating-function technique for transforming the differential-difference equations into a partial
differential equation of simple type. This device was suggested to me by Dr M. S. Bartlett
in the summer of 1946, and he has also applied his method to a number of Markoff processes
of interest in biology, one of which is the birth-and-death process of § 5. An account of this
work is now available (Bartlett, 1947). In another place I intend to give a discussion of
the most general birth-and-death process of this type, in which the birth- and death-rates
$\beta$ and $\mu$ can be any functions of the time $t$ (Kendall, 1948); this development also makes use
of the generating-function procedure.

It is a pleasure to acknowledge my indebtedness to Dr Bartlett and to the many other 
friends whose comments have helped to clarify my ideas on this subject. In particular, 
I should like to thank Mr D. J. Finney and members of his Seminar in Biological Statistics 
in Oxford, and Mr F. J. Anscombe, Dr P. Jones and Dr C. B. Williams of the Rothamsted 
Experimental Station. 

* The condition $n > 0$ is to be introduced after (33) has been established. If it were imposed from
the start, the a posteriori distribution of the time of arrival of the immigrant ancestor would no
longer be a uniform one, but the appropriate modification of the argument would lead to the same
final result.

† Feller (1939). I am greatly indebted to Prof. Feller for the gift of a reprint of this paper. In private
correspondence he tells me that he also, in unpublished lectures, has solved the equations governing
the birth-and-death process discussed here in § 5.

‡ Arley's monograph is also of great value in presenting a useful account (including several new
developments) of the Kolmogoroff-Feller theory of Markoff processes, especially those of the
discontinuous type used here. Another excellent account has been given recently by O. Lundberg (1940). For a
general introduction to this subject, reference may be made to my review article (1947).





REFERENCES 

Arley, N. (1943). On the Theory of Stochastic Processes and their Application to the Theory of Cosmic
Radiation. Copenhagen: G. E. C. Gads.

Bartlett, M. S. (1937). Deviations from expected frequencies in the theory of inbreeding. J. Genet.
35, 83-7.

Bartlett, M. S. (1947). Stochastic Processes. (Notes of a course given at the University of North
Carolina in the Fall Quarter, 1946. It is understood that copies of these notes are available on
request.)

Feller, W. (1939). Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in
wahrscheinlichkeitstheoretischer Behandlung. Acta Biotheoretica, 5, 11-40. This seems not to be
generally available in Great Britain, but there is an abstract in Math. Rev. 1, 22.

Fisher, R. A. (1930). The Genetical Theory of Natural Selection, pp. 73-83. Oxford: Clarendon Press.

Fisher, R. A. & Vaughan, Janet (1939). Surnames and blood-groups. Nature, Lond., 144, 1047.

Fisher, R. A., Corbet, A. S. & Williams, C. B. (1943). The relation between the number of species
and the number of individuals in a random sample of an animal population. J. Anim. Ecol.
12, 42-58.

Galton, Francis (1889). Natural Inheritance, Appendix F (which includes the contribution by
H. W. Watson). London: Macmillan.

Greenwood, M. & Yule, G. U. (1920). J. R. Statist. Soc. 83, 255-79.

Hinshelwood, C. N. (1946). The Chemical Kinetics of the Bacterial Cell, Chapter x (see especially
Fig. 64). Oxford: University Press.

Kelly, C. D. & Rahn, O. (1932). The growth rate of individual bacterial cells. J. Bacteriol. 23, 147-53.

Kendall, D. G. (1947). A review of some recent work on discontinuous Markoff processes with
applications to biology, physics, and actuarial science. J. R. Statist. Soc. 110, 130-7.

Kendall, D. G. (1948). On the generalized 'birth-and-death' process. Ann. Math. Statist. (in the
Press).

Lotka, A. J. (1931). The extinction of families. J. Wash. Acad. Sci. 21, 377-80 and 453-9.

Lotka, A. J. (1945). Population analysis as a chapter in the mathematical theory of evolution.
Included in Essays on Growth and Form, presented to D'Arcy Wentworth Thompson (edited by Le Gros
Clark, W. E. & Medawar, P. B.). Oxford: Clarendon Press.

Lundberg, O. (1940). On Random Processes and their Application to Sickness and Accident Statistics.
Uppsala: Almqvist and Wiksells.

Roberts, J. A. Fraser (1941-2). Blood-group frequencies in north Wales. Ann. Eugen., Lond., 11,
260-71. (Includes references to earlier work on the distribution of surnames.)

Williams, C. B. (1944a). Some applications of the logarithmic series and the index of diversity to
ecological problems. J. Ecol. 32, 1-44.

Williams, C. B. (1944b). The numbers of publications written by biologists. Ann. Eugen., Lond.,
12, 143-6.

Williams, C. B. (1947). The logarithmic series and its application to biological problems. J. Ecol.
(in the Press).





THE STUDENTIZED FORM OF THE EXTREME MEAN SQUARE
TEST IN THE ANALYSIS OF VARIANCE†

By K. R. NAIR 

CONTENTS

Part I. Expansion of the studentized integral: extension of H. O. Hartley's results

1. Introduction
2. Expansion of the studentized integral in powers of $\nu^{-1}$
3. Application to the incomplete beta-function
4. Application to 'Student's' integral
5. Some further applications

Part II. Application to tests regarding the largest and smallest of several variances

1. Introduction
2. Probability integral of the largest variance ratio
3. Construction of tables
4. An illustration
5. Probability integral of the smallest variance ratio


Part I. Expansion of the studentized integral:

EXTENSION OF H. O. HARTLEY'S RESULTS

1. Introduction 

Let $x_1, \ldots, x_n$ denote a sample of $n$ observations drawn from a normal population with mean
$\mu$ and standard deviation $\sigma$. Then the difference between sample and population mean,
$\bar{x} - \mu$, is normally distributed with zero mean and standard deviation $\sigma/\sqrt{n}$. If we estimate
the standard deviation of the parent, $\sigma$, by

$$s = \sqrt{\left[\sum (x-\bar{x})^2/(n-1)\right]},$$

then 'Student' (1908) gave the distribution of

$$t = (\bar{x} - \mu)\sqrt{n}/s.$$

As was to be expected, this distribution was independent of $\sigma$. The knowledge of the
distribution of $t$ made it possible to draw inferences, with the help of evidence entirely supplied
by the sample, about the location-parameter $\mu$ of a normal population, without making
any assumptions about the generally unknown scale-parameter, $\sigma$.

Neyman & Pearson (1928) extended this notion to an analogous problem connected with 
the rectangular and exponential populations. 

Fisher (1924) obtained the distribution of $s'/s$ where $s'$ and $s$ are two independent estimates
of $\sigma$, calculated by the root-mean-square method. This distribution is also independent of $\sigma$.
Fisher also showed that 'Student's' $t$-distribution could be extended to a wide range of
sampling problems. Sukhatme (1937) has shown that analogues to $t$ and $s'/s$ of 'normal
theory' can be developed for an exponential population, again eliminating the unknown
scale parameter.

This notion of eliminating unknown scale parameters from the distribution laws of
statistics has come to be known as 'studentization'.

† Part of a thesis approved for the degree of Ph.D. of the University of London.





If instead of $s'$ and $s$ which give unbiased estimates of $\sigma$, we use other types of statistics
which are 'proportional' to $\sigma$,* such as the range, mean deviation, etc., it is clear that the
distribution of the ratio of two such estimates will be independent of $\sigma$, although the analytic
expression for this distribution may be difficult to obtain.

Hartley (1938, 1944) has shown that if instead of $s'/s$ we have a ratio $w/s$, where (i) $w$ is
a general statistic 'proportional' to $\sigma$ calculated from a sample of $n$ observations $x_1, \ldots, x_n$
drawn from a normal population and (ii) $s$ is independently distributed with $\nu$ degrees of
freedom, then the distribution of $w/s$ can be derived without much difficulty if the
distribution of $w/\sigma$ is known. His solution is described in some detail in the next section and its
application to special problems is considered in the succeeding sections.

The probability integral of $w/s$ may be called the studentized integral of $w$. When $\nu \to \infty$
the studentized integral becomes identical with the probability integral of $w/\sigma$, which
may in turn be called the $\infty$-integral of $w/s$.

2. Expansion of the studentized integral in powers of $\nu^{-1}$

Let $f_\nu(w/s)$ denote the distribution function of $w/s$. When $\nu \to \infty$ this gives the distribution
function of $w/\sigma$, which we shall denote by $f(w/\sigma)$.

Let the probability integral of $w/s$ be

$$P_\nu(Q) = \int_0^Q f_\nu(w/s)\,d(w/s).$$

When $\nu \to \infty$, $P_\nu(Q)$ gives the probability integral of $w/\sigma$ which we may denote as $P(Q)$.
Assuming $\sigma$ to be unity, it follows (see, for example, Hartley) that

$$P_\nu(Q) = 2\,\Gamma(\tfrac{1}{2}\nu)^{-1}(\tfrac{1}{2}\nu)^{\frac{1}{2}\nu}\int_0^\infty s^{\nu-1}e^{-\frac{1}{2}\nu s^2}P(Qs)\,ds. \tag{1}$$

Hartley made the significant discovery that a recurrence formula connecting $P_\nu(Q)$ and
(Q) could be built up and that, thereby, $P_\nu(Q)$ can be obtained as the solution of a partial
differential equation. He solves this equation by iteration, and at the third stage obtains
the first three terms in an expansion for $P_\nu(Q)$ in powers of $\nu^{-1}$, namely,

$$P_\nu(Q) = P(Q) + \frac{1}{4\nu}(Q^2P'' - QP') + \frac{1}{96\nu^2}(3Q^4P^{\mathrm{iv}} - 2Q^3P''' - 3Q^2P'' + 3QP'), \tag{2}$$

where the derivatives of $P$ are taken at the argument $Q$.
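
As an illustration of the two correction terms in (2), the expansion can be applied to the probability integral of |t|, whose limiting form as ν tends to infinity is P(Q) = 2Φ(Q) − 1, Φ being the normal integral. The sketch below rests on that assumption; the derivatives of P are written out by hand from the normal density, the exact value is obtained by Simpson's rule on 'Student's' density (a numerical convenience introduced here), and all function names are hypothetical.

```python
import math

def normal_abs_integral(Q):
    """P(Q) = 2*Phi(Q) - 1, the limiting probability integral of |t|."""
    return math.erf(Q / math.sqrt(2.0))

def hartley_two_terms(Q, nu):
    """P(Q) plus the 1/nu and 1/nu^2 corrections of expansion (2)."""
    phi = math.exp(-0.5 * Q * Q) / math.sqrt(2.0 * math.pi)
    P1 = 2.0 * phi                         # P'
    P2 = -2.0 * Q * phi                    # P''
    P3 = 2.0 * (Q * Q - 1.0) * phi         # P'''
    P4 = -2.0 * (Q ** 3 - 3.0 * Q) * phi   # P^iv
    a1 = (Q ** 2 * P2 - Q * P1) / 4.0
    a2 = (3 * Q ** 4 * P4 - 2 * Q ** 3 * P3 - 3 * Q ** 2 * P2 + 3 * Q * P1) / 96.0
    return normal_abs_integral(Q) + a1 / nu + a2 / nu ** 2

def student_abs_integral(Q, nu, intervals=2000):
    """Pr(|t| < Q) for nu degrees of freedom, by Simpson's rule."""
    log_c = (math.lgamma(0.5 * (nu + 1)) - math.lgamma(0.5 * nu)
             - 0.5 * math.log(nu * math.pi))
    def density(t):
        return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(t * t / nu))
    h = Q / intervals
    total = density(0.0) + density(Q)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * density(i * h)
    return 2.0 * total * h / 3.0

if __name__ == "__main__":
    Q, nu = 2.0, 20
    print(normal_abs_integral(Q), hartley_two_terms(Q, nu), student_abs_integral(Q, nu))
```

For Q = 2 and ν = 20 the two-term value should lie much closer to the exact integral than the uncorrected normal value does, in line with the remark that the formula is adequate for ν of the order 20 or larger.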

For $\nu$ of the order 20 or larger this formula will in most cases give quite sufficient accuracy.
Sometimes when $\nu$ is small, however, we may need at least the next two terms of the
expansion. The term in $\nu^{-3}$ immediately follows from equation (24) of Hartley's (1944) paper and is†

$$\frac{1}{6(4^3)\nu^3}\,(\phi^{\mathrm{vi}} - 14\phi^{\mathrm{v}} + 68\phi^{\mathrm{iv}} - 136\phi''' + 96\phi''), \tag{3}$$

where $\phi(\lambda) = P(e^{\lambda}) = P(Q)$ and $\lambda = \log Q$.

The term in $\nu^{-4}$ will come from the fifth stage of the iteration which involves some lengthy
algebra. In terms of $\nu$, $\phi$ and $\lambda$ this becomes

$$\frac{1}{360(4^4)\nu^4}\,(15\phi^{\mathrm{viii}} - 360\phi^{\mathrm{vii}} + 3320\phi^{\mathrm{vi}} - 14784\phi^{\mathrm{v}} + 32240\phi^{\mathrm{iv}} - 28800\phi''' + 1280\phi'' + 6144\phi'). \tag{4}$$

* A statistic $w$ is 'proportional' to $\sigma$ if it transforms into $w/\sigma$ when the observations are measured
in units of $\sigma$.

t The coefficient of <j> m is misprinted — ^ for — ^ in Hartley’s paper. 

Biometrika 35 


2 





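The derivatives of $\phi$ can be expressed in terms of those of $P(Q)$.

$$\begin{aligned}
\phi(\lambda) &= P(Q),\\
\phi' &= QP',\\
\phi'' &= Q^2P'' + QP',\\
\phi''' &= Q^3P''' + 3Q^2P'' + QP',\\
\phi^{\mathrm{iv}} &= Q^4P^{\mathrm{iv}} + 6Q^3P''' + 7Q^2P'' + QP',\\
\phi^{\mathrm{v}} &= Q^5P^{\mathrm{v}} + 10Q^4P^{\mathrm{iv}} + 25Q^3P''' + 15Q^2P'' + QP',\\
\phi^{\mathrm{vi}} &= Q^6P^{\mathrm{vi}} + 15Q^5P^{\mathrm{v}} + 65Q^4P^{\mathrm{iv}} + 90Q^3P''' + 31Q^2P'' + QP',\\
\phi^{\mathrm{vii}} &= Q^7P^{\mathrm{vii}} + 21Q^6P^{\mathrm{vi}} + 140Q^5P^{\mathrm{v}} + 350Q^4P^{\mathrm{iv}} + 301Q^3P''' + 63Q^2P'' + QP',\\
\phi^{\mathrm{viii}} &= Q^8P^{\mathrm{viii}} + 28Q^7P^{\mathrm{vii}} + 266Q^6P^{\mathrm{vi}} + 1050Q^5P^{\mathrm{v}} + 1701Q^4P^{\mathrm{iv}} + 966Q^3P''' + 127Q^2P'' + QP'.
\end{aligned} \tag{5}$$

Substituting these in (3) and (4) we get, up to terms in $\nu^{-4}$,

$$P_\nu(Q) = a_0 + a_1/\nu + a_2/\nu^2 + a_3/\nu^3 + a_4/\nu^4, \tag{6}$$

where

$$\begin{aligned}
a_0 &= P(Q),\\
a_1 &= \tfrac{1}{4}(Q^2P'' - QP'),\\
a_2 &= \tfrac{1}{96}(3Q^4P^{\mathrm{iv}} - 2Q^3P''' - 3Q^2P'' + 3QP'),\\
a_3 &= \tfrac{1}{384}(Q^6P^{\mathrm{vi}} + Q^5P^{\mathrm{v}} - 7Q^4P^{\mathrm{iv}} + 12Q^3P''' - 15Q^2P'' + 15QP'),\\
a_4 &= \tfrac{1}{360(4^4)}(15Q^8P^{\mathrm{viii}} + 60Q^7P^{\mathrm{vii}} - 250Q^6P^{\mathrm{vi}} + 366Q^5P^{\mathrm{v}}\\
&\qquad\qquad - 285Q^4P^{\mathrm{iv}} - 30Q^3P''' + 945Q^2P'' - 945QP').
\end{aligned} \tag{7}$$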
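
The integer coefficients in the list of derivatives of φ need not be copied by hand. Since φ(λ) = P(e^λ), differentiation with respect to λ is the operator Q d/dQ, and the k-th power of that operator expands with the Stirling numbers of the second kind as coefficients. The sketch below, with a hypothetical function name, generates them from the usual recurrence S(k, j) = S(k−1, j−1) + j S(k−1, j).

```python
def stirling2_rows(k_max):
    """rows[k][j] is the coefficient of Q^j P^(j) in the k-th derivative of phi."""
    rows = [[1]]                               # k = 0: phi itself
    for k in range(1, k_max + 1):
        prev = rows[-1]
        row = [0] * (k + 1)
        for j in range(1, k + 1):
            same_order = prev[j] if j < len(prev) else 0
            row[j] = prev[j - 1] + j * same_order
        rows.append(row)
    return rows

if __name__ == "__main__":
    for k, row in enumerate(stirling2_rows(8)):
        if k:
            # print from the highest derivative down, as in the text
            print(k, row[:0:-1])
```

Each printed row lists the coefficients in the order used in the text, from the highest derivative of P down to QP'.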


From (7) we can derive the following results 

jdr'dQ - - Ij'HdQ = - f-jaM = - [%Ja,dQ - - 

jQ*P'dQ = - fea^Q = -ljQa 2 dQ = -^Q^dQ = - ’ fa^dQ, 

( 8 ) 

fo>rd<t - -|JWe = 

jQ*P'dQ = ~ljQ 3 ai dQ = - -Tk/^ 4 ^’ 

where the limits of integration are 0 and oo. 

These results enable us to express the moments of the studentized statistic in terms of
those of the non-studentized statistic. Thus, up to terms in $\nu^{-4}$, we have


3 25 105 1659 

4v + 327-2 + 12 8^3 + 2048^ 
4 8 


)J>” 


(Q)dQ, 


1- 


)/; 


Q*P'(Q)dQ, 


j~K(Q)dQ=j“p(Q)dQ-Q 

jy,w)dQ yyp(Q)dQ-^y 

jy’PM)dQ.jyPiQ)dQ-{l + ^ + 1 f)jyp’ { Q )i Q. 


(9) 





3. Application to the incomplete beta-function 

By applying expansion (2) to the probability integral of the square root of a variance ratio
having $2p$ and $2q$ degrees of freedom, Hartley obtained a new formula for the incomplete
beta-function

$$I_x(p,q) = \frac{\Gamma(p+q)}{\Gamma(p)\Gamma(q)}\int_0^x t^{p-1}(1-t)^{q-1}\,dt$$

in terms of the incomplete gamma-function and its derivatives.

Using the additional terms in $\nu^{-3}$ and $\nu^{-4}$ given in (6) this formula becomes

$$I_x(p,q) = \Gamma(p)^{-1}\int_0^{\omega} e^{-y}y^{p-1}\,dy + \Gamma(p)^{-1}e^{-\omega}\omega^{p}\{b_1/(2q) + b_2/(2q)^2 + b_3/(2q)^3 + b_4/(2q)^4\}, \tag{10}$$

where $x = \omega/(\omega+q)$ and

$$\begin{aligned}
b_1 &= (p-1) - \omega,\\
b_2 &= \tfrac{1}{6}\{(p-1)(p-2)(3p-1) - (p-1)(9p-2)\omega + (9p-1)\omega^2 - 3\omega^3\},\\
b_3 &= \tfrac{1}{6}\{p(p-1)^2(p-2)(p-3) - p(p-1)(p-2)(5p-3)\omega + 2p(p-1)(5p-1)\omega^2\\
&\qquad - 2p(5p+1)\omega^3 + (5p+3)\omega^4 - \omega^5\},\\
b_4 &= \tfrac{1}{360}\{(p-1)(p-2)(p-3)(p-4)(15p^3 - 30p^2 + 5p + 2)\\
&\qquad - (p-1)(p-2)(p-3)(105p^3 - 135p^2 + 10p + 8)\omega\\
&\qquad + (p-1)(p-2)(315p^3 - 180p^2 + 5p + 12)\omega^2\\
&\qquad - (p-1)(525p^3 + 75p^2 + 50p + 8)\omega^3\\
&\qquad + (525p^3 + 450p^2 + 175p + 2)\omega^4\\
&\qquad - 5(63p^2 + 99p + 46)\omega^5 + 15(7p+9)\omega^6 - 15\omega^7\}.
\end{aligned}$$


This formula strictly applies only when $p$ and $q$ of the incomplete beta-function are such
that $2p$ and $2q$ are positive integers. It is not suitable when $p$ is large, because of the slow
convergence of $\sum b_i/(2q)^i$. For small values of $p$ and moderate or large values of $q$, however,
the formula is quite useful. For large $p$ and $q$, Wishart (1927) and others have developed
useful methods.

To give some idea of the accuracy of the formula when only terms up to b 2 are used, Hartley 
considered the example p = 1, o) = ™ and various integral values of q in the range 5 to 50. 
The agreement between the exact and the approximate values was very good for values 
of $q > 20$. For lower values of $q$, addition of the terms in $b_3$ and $b_4$ makes the approximations
closely agree with the exact values.


Table giving exact and approximate values of $I_x(p, q)$ for $p = 1$, $\omega = 50/9$ and $x = \omega/(\omega + q)$

                            Approximation
  q      Exact        (1)          (2)          (3)
  5    0.976 155    0.974 627    0.975 597    0.976 788
 10    0.987 945    0.987 774    0.987 896    0.987 970
 15    0.991 140    0.991 093    0.991 129    0.991 144
 20    0.992 571    0.992 553    0.992 568    0.992 572




Studentized form of the extreme mean square test 

In this table approximation (1) uses terms up to $b_2$, and approximations (2) and (3) are obtained by including terms up to $b_3$ and $b_4$ respectively. The last approximation gives values in excess of the exact values, which shows that the value of the omitted remainder terms for $\omega = 50/9$ is negative, though negligible when $q$ exceeds 10.
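The comparison in the table can be retraced numerically. The following is a minimal sketch, not the author's computing procedure: it codes $b_1, \ldots, b_4$ of (10) and evaluates the case $p = 1$, for which the incomplete gamma-function reduces to $1 - e^{-\omega}$ and the exact value is $1 - (1 + \omega/q)^{-q}$. The function names are illustrative, and $\omega = 50/9$ is an assumption adopted here because it reproduces the tabulated exact values.

```python
import math

def b_coefficients(p, w):
    """Coefficients b1..b4 of expansion (10) for given p and omega (w)."""
    b1 = (p - 1) - w
    b2 = ((p - 1) * (p - 2) * (3 * p - 1)
          - (p - 1) * (9 * p - 2) * w
          + (9 * p - 1) * w**2
          - 3 * w**3) / 6
    b3 = (p * (p - 1)**2 * (p - 2) * (p - 3)
          - p * (p - 1) * (p - 2) * (5 * p - 3) * w
          + 2 * p * (p - 1) * (5 * p - 1) * w**2
          - 2 * p * (5 * p + 1) * w**3
          + (5 * p + 3) * w**4
          - w**5) / 6
    b4 = ((p - 1) * (p - 2) * (p - 3) * (p - 4) * (15 * p**3 - 30 * p**2 + 5 * p + 2)
          - (p - 1) * (p - 2) * (p - 3) * (105 * p**3 - 135 * p**2 + 10 * p + 8) * w
          + (p - 1) * (p - 2) * (315 * p**3 - 180 * p**2 + 5 * p + 12) * w**2
          - (p - 1) * (525 * p**3 + 75 * p**2 + 50 * p + 8) * w**3
          + (525 * p**3 + 450 * p**2 + 175 * p + 2) * w**4
          - 5 * (63 * p**2 + 99 * p + 46) * w**5
          + 15 * (7 * p + 9) * w**6
          - 15 * w**7) / 360
    return [b1, b2, b3, b4]

def beta_approx_p1(w, q, n_terms):
    """Approximate I_x(1, q), x = w/(w+q), keeping the first n_terms b-terms."""
    b = b_coefficients(1, w)
    gamma_part = 1.0 - math.exp(-w)          # incomplete gamma-function for p = 1
    series = sum(b[i] / (2 * q)**(i + 1) for i in range(n_terms))
    return gamma_part + math.exp(-w) * w * series

def beta_exact_p1(w, q):
    """Exact I_x(1, q) = 1 - (1 - x)^q with x = w/(w+q)."""
    return 1.0 - (1.0 + w / q)**(-q)

if __name__ == "__main__":
    w = 50 / 9                               # assumed value of omega
    for q in (5, 10, 15, 20):
        row = [beta_exact_p1(w, q)] + [beta_approx_p1(w, q, n) for n in (2, 3, 4)]
        print(q, " ".join(f"{v:.6f}" for v in row))
```

Keeping two, three and four terms corresponds to approximations (1), (2) and (3) of the table.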


4. Application to ‘ Student’s’ integral 

Fisher (1926) gave an expansion for the probability integral of ‘Student’s’ $t$, in powers of $\nu^{-1}$. He worked out the coefficients of the terms up to $\nu^{-5}$ and found that the maximum value of the fifth correction never exceeded $10^{-6}$ when $\nu$ was greater than 18. It will be interesting to compare the coefficients in Fisher’s expansion with those obtained by applying Hartley’s method.

Since $t$ is symmetrically distributed about 0, we may consider the probability integral of $|t|$. To get the expansion of this integral by Hartley’s method we have only to replace the $P(Q)$ in (6) by the ‘$\infty$-integral’ of $|t|$, namely,

$$P(t) = \sqrt{\frac{2}{\pi}} \int_0^t e^{-\frac12 x^2}\,dx. \qquad (12)$$


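Alternatively, since $|t|$ is the same as the square root of a variance ratio having 1 and $\nu$ degrees of freedom, putting $p = \tfrac12$, $q = \tfrac12\nu$ and $\omega = \tfrac12 t^2$ in (10), we may obtain its probability integral in the form

$$\sqrt{\frac{2}{\pi}} \int_0^t e^{-\frac12 x^2}\,dx - \sqrt{\frac{2}{\pi}}\, e^{-\frac12 t^2}\left\{\frac{T_1}{\nu} + \frac{T_2}{\nu^2} + \frac{T_3}{\nu^3} + \frac{T_4}{\nu^4}\right\}, \qquad (13)$$

where

$$\begin{aligned}
T_1 &= \tfrac14\, t(t^2 + 1),\\
T_2 &= \frac{t}{6(4^2)}\,(3t^6 - 7t^4 - 5t^2 - 3),\\
T_3 &= \frac{t}{6(4^3)}\,(t^{10} - 11t^8 + 14t^6 + 6t^4 - 3t^2 - 15),\\
T_4 &= \frac{t}{360(4^4)}\,(15t^{14} - 375t^{12} + 2225t^{10} - 2141t^8 - 939t^6 - 213t^4 + 915t^2 + 945).
\end{aligned} \qquad (14)$$

The probability integral of $t$ from $t$ to $\infty$ immediately follows as

$$\frac{1}{\sqrt{2\pi}} \int_t^{\infty} e^{-\frac12 x^2}\,dx + \frac{1}{\sqrt{2\pi}}\, e^{-\frac12 t^2}\left\{\frac{T_1}{\nu} + \frac{T_2}{\nu^2} + \frac{T_3}{\nu^3} + \frac{T_4}{\nu^4}\right\}. \qquad (15)$$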


The coefficients $T_1$, $T_2$, $T_3$ and $T_4$* are identical to those given by Fisher, providing us with a useful confirmation of the general applicability of Hartley’s method.
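As a numerical illustration of (15), the sketch below evaluates the upper-tail integral of $t$ from the four correction terms and sets it against a direct Simpson quadrature of the $t$ density. It is offered only as a check under stated assumptions: the polynomials are those obtained by putting $p = \tfrac12$, $\omega = \tfrac12 t^2$ in (10), and the function names and the quadrature routine are illustrative, not part of the original method.

```python
import math

def student_upper_tail_series(t, nu):
    """Upper-tail probability of Student's t from the four-term expansion (15)."""
    t2 = t * t
    T1 = t * (t2 + 1) / 4
    T2 = t * (3 * t2**3 - 7 * t2**2 - 5 * t2 - 3) / 96
    T3 = t * (t2**5 - 11 * t2**4 + 14 * t2**3 + 6 * t2**2 - 3 * t2 - 15) / 384
    T4 = t * (15 * t2**7 - 375 * t2**6 + 2225 * t2**5 - 2141 * t2**4
              - 939 * t2**3 - 213 * t2**2 + 915 * t2 + 945) / 92160
    normal_tail = 0.5 * math.erfc(t / math.sqrt(2))
    ordinate = math.exp(-t2 / 2) / math.sqrt(2 * math.pi)
    return normal_tail + ordinate * (T1 / nu + T2 / nu**2 + T3 / nu**3 + T4 / nu**4)

def student_upper_tail_quadrature(t, nu, n=4000):
    """Upper-tail probability by Simpson's rule on the t density over (0, t)."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))

    def density(x):
        return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

    h = t / n
    total = density(0.0) + density(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * density(i * h)
    return 0.5 - total * h / 3

if __name__ == "__main__":
    for nu in (10, 20, 30):
        print(nu, student_upper_tail_series(2.0, nu),
              student_upper_tail_quadrature(2.0, nu))
```

For $\nu = 20$ and $t = 2.086$ (the familiar two-sided 5 % point) the series gives a tail area close to 0.025.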


5. Some further applications 

In his 1938 paper, Hartley examined the studentized probability integral of the largest and smallest among $k$ independent estimates of a variance, each having 1 degree of freedom. The importance of this problem in the analysis of factorial experiments was first pointed out by Wishart (1938). As an expansion of the integral in powers of $\nu^{-1}$ was not then known, Hartley approached the problem by an approximate method which was satisfactory for getting the lower percentage points of the smallest variance, but not for the more important case of upper percentage points of the largest variance.

* The coefficient of $t^2$ in $T_4$ appears with a negative sign in Fisher’s (1926) paper but the correct sign has been used in a later paper, viz. Ann. Eugen. 11, 141-72.





Meanwhile, Finney (1941) gave an exact solution for the studentized integral of the 
largest and smallest of k variances, each estimated with m degrees of freedom. It is not easy 
to make this exact solution amenable to numerical calculation of the percentage points for 
odd values of m. Hartley’s expansion (2) with the additional terms given in (6) could usefully 
be employed to get the percentage points when m = 1. Details of the method and tables of 
the 5 and 1 % points are given in Part II. 

Pearson & Hartley (1943) used expansion (2) to calculate the studentized integral of the range. This is the probability integral of $(x_n - x_1)/s$, where $x_1$ and $x_n$ are the two extreme observations in a sample of $n$ observations and $s$ is an independent estimate of $\sigma$ based on $\nu$ degrees of freedom. The studentized range is of importance in quality control charts for industrial products. A closely allied statistic, namely, the studentized extreme deviate $(x_n - \bar{x})/s$ (or $(\bar{x} - x_1)/s$) has useful applications in designed experiments when we are concerned with judging whether a single outlying treatment (best or worst) really differs from the rest. The probability integral of this statistic will be considered in another paper (Nair, 1948) and tables provided for $n = 3, 4, \ldots, 9$.

There are many other examples where the expansion of the studentized integral can usefully be employed. The applications cited above are convincing proof of its potentialities.


Part II. Application to tests regarding the largest and smallest of several variances

1. Introduction 

Let $s_1^2, \ldots, s_k^2$ be independent estimates of an unknown variance $\sigma^2$ of a normal population, each calculated from sums of squares having $m$ degrees of freedom and arranged in ascending order of magnitude. Let $s_0^2$ be another independent estimate based on $\nu$ degrees of freedom against which each of the first $k$ estimates is to be tested for significant differences from $s_0^2$. In routine analysis of variance where such tests frequently occur, it is customary to test each of the $k$ variances against $s_0^2$ and separately declare whether there is a significant difference or not. Forming now for each of the independent $s_i^2$ the ratio $F_i = s_i^2/s_0^2$, it is obvious that the largest ratio, $F_k = s_k^2/s_0^2$, is more likely to be declared significant than any other variance ratio in the set. Indeed, for $k = 20$ it would be expected to be significant at the 5 % level of $F$ for $m$ and $\nu$ degrees of freedom. This source of bias can be eliminated if the probability integral of the largest variance ratio is numerically evaluated.

Although the theory discussed in this paper applies to any value of m, owing to practical 
difficulties, tables of the upper 5 and 1 % points of the largest variance ratio have been 
prepared only when m = 1 , which is the most important case in practice. Thus, as Wishart 
(1938) has pointed out, it has useful application in the analysis of variance for 2 × 2 × 2 × ...
factorial design. It could in fact be applied to the analysis of any designed experiment where 
the total variance due to treatment effects is split up into components each having a 
single degree of freedom. 

Another possible application is in the fitting of curves with orthogonal terms, e.g. orthogonal polynomials and harmonic analysis. It may well happen that while, taken as a whole, there is no significant curvilinearity, the coefficient of a single high order term turns out to be very large compared to the rest. The new tables will be useful in testing whether such isolated coefficients are significantly large.




The theory developed for the largest variance ratio could easily be extended to any other ranked ratio. Of these the smallest, namely, $F_1 = s_1^2/s_0^2$, has been considered. Occasions to test the significance of the smallest of $k$ variance ratios are much less frequent than that of the largest. In the former case, we are concerned with the lower percentage points of the probability integral, in order that a test could be made whether an observed smallest ratio is significantly small.

In field experiments, the smallest of $k$ variances tested may become significantly small compared to the ‘error’ variance ($s_0^2$) if the $m + 1$ groups of plot values from which the $m$ degrees of freedom of that variance were obtained showed a high negative intra-class correlation. The extreme value that this correlation can have, if there are $l$ plots in each group, is $-1/(l - 1)$, so that it can be $-1$ only if $l = 2$. Wishart (1938) again gives an example where a test for significance of smallest variance ratio when $m = 1$ is appropriate.

A rather interesting use for the test of significance of the smallest variance ratio, when 
m = 1, can be found in certain methods of statistical control in sample surveys devised by 
Mahalanobis (1944). If, say, you are sampling agricultural fields in k districts and send two 
investigators to each district to collect half the quota of sample units into two randomly 
selected half samples A and B, a comparison of the A and B mean values for each district 
should give a clue as to whether the data have been properly collected. If one of the A-B 
differences is very large and the variance ratio significant it is usual to conclude that the 
two investigators concerned did not carry out the instructions in the same manner or that
some other personal bias had crept in. On the other hand if one of the k differences between 
(A, B) pairs is surprisingly small and turns out to give a significant smallest variance ratio, 
it may perhaps arouse a suspicion that the two investigators consulted each other and dis- 
honestly made the means of their data agree. This is the negative side of the argument in 
favour of a test of significance of the smallest variance ratio. On the positive side, it may 
often turn out that the smallest variance ratio is not significantly small, avoiding awkward 
aspersions on the reliability of the investigators. 


2. Probability integral of the largest variance ratio 

Let ${}_\nu P_k(Q)$ denote the probability of $s_k/s_0$ being $\le Q$. This is the same as the probability of $F_k = s_k^2/s_0^2$ being $\le Q^2$. When $\nu \to \infty$, this probability will be denoted by $P_k(Q)$.

Finney (1941) showed that ^ (Q) = M * (1 +A) -* (1) 

_!? o'? 


where 


M = r(£m)- 


r 


0A 


u iml e u du r 


( 2 )* 


and A is put equal to zero in (1) after differentiation. 

He found that (1) was not amenable to numerical calculations if $m$ is odd. The simplest case is $m = 2$ for which the probability integral reduces to

$${}_\nu P_k(Q) = \sum_{r=0}^{k} (-1)^r\, {}^kC_r \left(1 + \frac{2rQ^2}{\nu}\right)^{-\frac12\nu}. \qquad (3)$$


Hartley (1938) had suggested that if we assume the $k$ variance ratios to be independent, an assumption quite justifiable if $\nu$ is very large, ${}_\nu P_k(Q)$ can be obtained approximately from the formula

$${}_\nu P_k(Q) \simeq \{{}_\nu P_1(Q)\}^k. \qquad (4)$$


* This appears as $1 - M$ in equation (10) of Finney’s paper through a printing error.





For $m = 2$, this becomes

$${}_\nu P_k(Q) \simeq \left\{1 - \left(1 + \frac{2Q^2}{\nu}\right)^{-\frac12\nu}\right\}^k. \qquad (5)$$


To get some idea of the adequacy of the approximate approach, Finney compared the 
5 % points for F k obtained by the two methods, when m = 2. He concluded that the approxi- 
mation was satisfactory when v was moderately large in comparison with k , and presumed 
that the agreement would generally improve with increase of m. The chief uncertainty 
according to him was, therefore, when m = 1 . 
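The two routes for $m = 2$ are easily set side by side. The sketch below is illustrative only: it takes the exact form to be the alternating sum over $r$ of ${}^kC_r(1 + 2rQ^2/\nu)^{-\frac12\nu}$ and the approximation to be the $k$th power of the single-ratio integral, which is how (3) and (5) are read here; the function names are hypothetical.

```python
import math

def exact_largest_m2(Q, k, nu):
    """Probability that the largest of k ratios (m = 2) is <= Q^2, exact sum."""
    return sum((-1)**r * math.comb(k, r) * (1 + 2 * r * Q * Q / nu)**(-nu / 2)
               for r in range(k + 1))

def approx_largest_m2(Q, k, nu):
    """The same probability on the assumption of independent ratios."""
    return (1 - (1 + 2 * Q * Q / nu)**(-nu / 2))**k

if __name__ == "__main__":
    for nu in (10, 30, 1000):
        e = exact_largest_m2(1.5, 5, nu)
        a = approx_largest_m2(1.5, 5, nu)
        print(nu, round(e, 6), round(a, 6), round(e - a, 6))
```

Because all $k$ ratios share the same denominator, the exact probability is never less than the independence value, and the gap closes as $\nu$ increases.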

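An alternative method of calculating (1) which is particularly suitable when $m = 1$ is Hartley’s expansion of a studentized integral. Thus, using (1) of Part I, the integral itself is

$${}_\nu P_k(Q) = 2\,\Gamma(\tfrac12\nu)^{-1} (\tfrac12\nu)^{\frac12\nu} \int_0^{\infty} s_0^{\nu-1} e^{-\frac12\nu s_0^2}\, P_k(Q s_0)\,ds_0, \qquad (6)$$

where

$$P_k(Q) = \left\{2\,\Gamma(\tfrac12 m)^{-1} (\tfrac12 m)^{\frac12 m} \int_0^{Q} x^{m-1} e^{-\frac12 m x^2}\,dx\right\}^k. \qquad (7)$$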


It could easily be verified that (6) and (7) lead to the same result as given in (3) when 
m = 2. 

In the form given by (6), ${}_\nu P_k(Q)$ can be evaluated with sufficient accuracy using expansion (6) of Part I where $P(Q)$ should be replaced by $P_k(Q)$ given in (7). The coefficients $a_0, a_1, a_2, \ldots$ of this expansion will be seen to involve powers and derivatives of the incomplete gamma-function, thus leading to Laguerre functions. When $m = 1$ or 2 these reduce to certain Hermite functions which are fully tabulated in the British Association Mathematical Tables, vol. I. In the general case, the Laguerre functions have to be used; these have been tabulated, although less extensively.

In the special case where $m = 1$, we have

$$P_k(Q) = \left\{\sqrt{\frac{2}{\pi}} \int_0^{Q} e^{-\frac12 x^2}\,dx\right\}^k. \qquad (8)$$

This can be expressed in terms of the Hermite function

$$Hh_0(x) = \int_x^{\infty} e^{-\frac12 t^2}\,dt \qquad (9)$$

tabulated in the British Association Mathematical Tables, which also give values of

$$Hh_{-n}(x) = \left(-\frac{d}{dx}\right)^{n} Hh_0(x) \qquad (10)$$

for $n = 1$ to 7. These have been used in preparing the tables described in § 3.

When $m = 2$, we have

$$P_k(Q) = \{1 - Hh_{-1}(x)\}^k, \quad \text{where } x = Q\sqrt{2}. \qquad (11)$$

The first six derivatives of the right-hand side of (11) can be obtained from the British 
Association Mathematical Tables. We could therefore proceed with Hartley’s method for 
the case m = 2 as well. It seems simpler, however, to use the exact formula (3). Finney has 
prepared a table of percentage points when k = 2 and 3. We shall consider only the case 
m = 1 . 




3. Construction of tables 

Writing $P_k(Q)$ as $P_k$ for brevity, and $Hh_{-n}$ for $Hh_{-n}(Q)$, we have, for $m = 1$,

$$\begin{aligned}
(\tfrac12\pi)^{\frac12 k} P_k &= \{\sqrt{\tfrac12\pi} - Hh_0(Q)\}^k = A^k \text{ (say)},\\
(\tfrac12\pi)^{\frac12 k} P_k' &= kA^{k-1} Hh_{-1},\\
(\tfrac12\pi)^{\frac12 k} P_k'' &= k(k-1)A^{k-2} Hh_{-1}^2 - kA^{k-1} Hh_{-2},\\
(\tfrac12\pi)^{\frac12 k} P_k''' &= k(k-1)(k-2)A^{k-3} Hh_{-1}^3 - 3k(k-1)A^{k-2} Hh_{-1}Hh_{-2} + kA^{k-1} Hh_{-3},\\
(\tfrac12\pi)^{\frac12 k} P_k^{\mathrm{iv}} &= k(k-1)(k-2)(k-3)A^{k-4} Hh_{-1}^4 - 6k(k-1)(k-2)A^{k-3} Hh_{-1}^2 Hh_{-2}\\
&\quad + k(k-1)A^{k-2}(3Hh_{-2}^2 + 4Hh_{-1}Hh_{-3}) - kA^{k-1} Hh_{-4},\\
(\tfrac12\pi)^{\frac12 k} P_k^{\mathrm{v}} &= k(k-1)(k-2)(k-3)(k-4)A^{k-5} Hh_{-1}^5 - 10k(k-1)(k-2)(k-3)A^{k-4} Hh_{-1}^3 Hh_{-2}\\
&\quad + 5k(k-1)(k-2)A^{k-3} Hh_{-1}(3Hh_{-2}^2 + 2Hh_{-1}Hh_{-3})\\
&\quad - 5k(k-1)A^{k-2}(2Hh_{-2}Hh_{-3} + Hh_{-1}Hh_{-4}) + kA^{k-1} Hh_{-5},\\
(\tfrac12\pi)^{\frac12 k} P_k^{\mathrm{vi}} &= k(k-1)(k-2)(k-3)(k-4)(k-5)A^{k-6} Hh_{-1}^6\\
&\quad - 15k(k-1)(k-2)(k-3)(k-4)A^{k-5} Hh_{-1}^4 Hh_{-2}\\
&\quad + 5k(k-1)(k-2)(k-3)A^{k-4}(9Hh_{-1}^2 Hh_{-2}^2 + 4Hh_{-1}^3 Hh_{-3})\\
&\quad - 15k(k-1)(k-2)A^{k-3}(Hh_{-2}^3 + 4Hh_{-1}Hh_{-2}Hh_{-3} + Hh_{-1}^2 Hh_{-4})\\
&\quad + k(k-1)A^{k-2}(10Hh_{-3}^2 + 15Hh_{-2}Hh_{-4} + 6Hh_{-1}Hh_{-5}) - kA^{k-1} Hh_{-6}.
\end{aligned} \qquad (12)$$

These six derivatives are sufficient to calculate $a_0$, $a_1$, $a_2$ and $a_3$ of expansion (6) of Part I. Since $a_4$ involves $Hh_{-8}$ which is not included in the British Association Mathematical Tables, it is not easy to calculate $a_4$ by the direct method. The calculation of $a_3$ itself is rather complicated by direct evaluation from the derivatives of $P_k$ and was therefore accomplished by an indirect method of numerical differentiation from $a_1$ and $a_2$. The same method could have been used to calculate $a_4$, but the idea was abandoned as that term does not appear seriously to affect the accuracy, unless $k$ is large and $\nu$ is small. A small panel of values of $a_4$ was, however, calculated by this method for select combinations of values of $Q$ and $k$ to get some idea of its range of magnitude. These are discussed later in this section.

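Since values of $Q^r P_k^{(r)}$ were required, an auxiliary function

$$h_r = Hh_{-r}\, Q^r A^{r-1} \qquad (13)$$

was introduced in terms of which $Q^r P_k^{(r)}$ became, for $r = 1, 2, 3$ and 4,

$$\begin{aligned}
(\tfrac12\pi)^{\frac12 k}\, Q P_k' &= kA^{k-1} h_1,\\
(\tfrac12\pi)^{\frac12 k}\, Q^2 P_k'' &= kA^{k-2}\{(k-1)h_1^2 - h_2\},\\
(\tfrac12\pi)^{\frac12 k}\, Q^3 P_k''' &= kA^{k-3}\{(k-1)(k-2)h_1^3 - 3(k-1)h_1 h_2 + h_3\},\\
(\tfrac12\pi)^{\frac12 k}\, Q^4 P_k^{\mathrm{iv}} &= kA^{k-4}\{(k-1)(k-2)(k-3)h_1^4 - 6(k-1)(k-2)h_1^2 h_2\\
&\qquad + (k-1)(3h_2^2 + 4h_1 h_3) - h_4\}.
\end{aligned} \qquad (14)$$

Expressions for $a_0$, $a_1$ and $a_2$ are

$$\begin{aligned}
a_0 &= (\tfrac12\pi)^{-\frac12 k} A^k,\\
a_1 &= \tfrac14 k (\tfrac12\pi)^{-\frac12 k}\left[\{(k-1)h_1^2 - h_2\}A^{k-2} - h_1 A^{k-1}\right],\\
a_2 &= \tfrac{1}{96}(3Q^4 P_k^{\mathrm{iv}} - 2Q^3 P_k''') - \tfrac18 a_1.
\end{aligned} \qquad (15)$$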
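The calculation of $a_0$, $a_1$ and $a_2$ for $m = 1$ may be sketched as follows. This is a hedged illustration rather than the tabular procedure actually used: it assumes $Hh_{-n}(Q) = He_{n-1}(Q)\,e^{-\frac12 Q^2}$ (the Hermite polynomial form implied by (10)), takes $h_r = Hh_{-r}Q^rA^{r-1}$, and uses $a_1 = \tfrac14(Q^2P'' - QP')$ and $a_2 = \tfrac1{96}(3Q^4P^{\mathrm{iv}} - 2Q^3P''') - \tfrac18 a_1$; the function name is hypothetical.

```python
import math

def hartley_coefficients_m1(Q, k):
    """Return (a0, a1, a2) for the largest of k single-degree-of-freedom ratios."""
    e = math.exp(-Q * Q / 2)
    # A = sqrt(pi/2) - Hh_0(Q) = integral of exp(-x^2/2) from 0 to Q
    A = math.sqrt(math.pi / 2) * math.erf(Q / math.sqrt(2))
    # Hh_{-n}(Q) = He_{n-1}(Q) exp(-Q^2/2) for n = 1..4 (assumed closed forms)
    hh = {1: e, 2: Q * e, 3: (Q * Q - 1) * e, 4: (Q**3 - 3 * Q) * e}
    h = {r: hh[r] * Q**r * A**(r - 1) for r in hh}
    c = (math.pi / 2)**(-k / 2)
    qp1 = c * k * A**(k - 1) * h[1]
    qp2 = c * k * A**(k - 2) * ((k - 1) * h[1]**2 - h[2])
    qp3 = c * k * A**(k - 3) * ((k - 1) * (k - 2) * h[1]**3
                                - 3 * (k - 1) * h[1] * h[2] + h[3])
    qp4 = c * k * A**(k - 4) * ((k - 1) * (k - 2) * (k - 3) * h[1]**4
                                - 6 * (k - 1) * (k - 2) * h[1]**2 * h[2]
                                + (k - 1) * (3 * h[2]**2 + 4 * h[1] * h[3]) - h[4])
    a0 = c * A**k
    a1 = (qp2 - qp1) / 4
    a2 = (3 * qp4 - 2 * qp3) / 96 - a1 / 8
    return a0, a1, a2

if __name__ == "__main__":
    a0, a1, a2 = hartley_coefficients_m1(2.4, 6)
    nu = 20
    print(a0, a1, a2, a0 + a1 / nu + a2 / nu**2)
```

For $k = 1$ the coefficients reduce to those of ‘Student’s’ integral in Part I, which gives a convenient check on the signs.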





It was decided to limit the values of $k$ to $\le 10$. Using the British Association Mathematical Tables, values of $A^k$, $h_1$, $h_2$, $h_3$ and $h_4$ were calculated to six decimal accuracy in the range 1.6 (0.2) 6.8 for $Q$. Values of $a_0$, $a_1$ and $a_2$ were then obtained and all the calculations checked by fifth order differencing on the National Accounting Machine.

The method used to obtain $a_3$ was to express it in terms of $a_1$, $a_2$ and their derivatives $a_1'$, $a_2'$ and $a_2''$. Thus

$$a_3 = \tfrac{1}{12} Q^2 a_2'' - \tfrac{19}{36} Q a_2' + \tfrac{16}{27} a_2 + \tfrac{4}{81}(Q a_1' - 4a_1). \qquad (16)$$

The values of $a_1'$, $a_2'$ and $a_2''$ were obtained by numerical differentiation, which gave the final values of $a_3$ to two decimal accuracy; this was ample for our purpose. An independent check was provided for a few test values, taking $k = 2$. For obtaining these values, $P^{\mathrm{iv}}$, $P^{\mathrm{v}}$ and $P^{\mathrm{vi}}$ were calculated and substituted in the formula

$$a_3 = \tfrac{1}{384}\left(Q^6 P^{\mathrm{vi}} + Q^5 P^{\mathrm{v}} + 7\tfrac13\, Q^3 P'''\right) - \tfrac{7}{12} a_2 - \tfrac{11}{48} a_1 \qquad (17)$$

and compared with those obtained with the help of (16). The comparison is set out in Table 1. The agreement is satisfactory for our purpose.


Table 1. Values of $a_3$

  Q     Formula (16)   Formula (17)
 2.2      1.8190         1.8242
 2.4      2.3353         2.3360
 2.6      2.3178         2.3176
 2.8      1.7453         1.7449
 3.0      0.7801         0.7799


A further short-cut was used in calculating $a_3$. Instead of using (16) for each value of $k$ ranging from 2 to 10, $a_3$ was first calculated for $k = 2$, 6 and 10. By three-point interpolation between them, values of $a_3$ were obtained for the remaining values of $k$.

Although $a_4$ was not used in the evaluation of ${}_\nu P_k(Q)$, a few of its values were calculated by numerical differentiation of $a_1$ and $a_2$, using the formula


its Q* a 2 V ~ 2 + rrizQ 2a l ~~ 2 ~ iv&h + z%tz{Q a 'i — ^i)- 


These values are presented in Table 2. 



The contribution from $a_4$ to the probability integral may be as large as 0.01 near the upper 5 and 1 % points if $k = 10$, $\nu = 10$. This will affect the percentage point of the largest variance ratio by as much as 1 in the unit’s place. If $\nu = 20$, only the first decimal will be affected by about 1 or 2. It is likely in practice that $\nu$ will be of the order of 20 or more when $k = 10$, so that the table will not vitiate the level of significance to any serious extent.




The values of v selected for constructing the tables of the upper 5 and 1 % points of F k 
were 10, 12, 15, 20, 30, 60 and oo to facilitate harmonic interpolation inside this range. 
Second-difference inverse interpolation was used to obtain the percentage points from the 
table of probability integrals. 

Table 3 shows the final results giving the upper 5 and 1 % points for $k = 1$ to 10. The
values for k = 1 were copied from Merrington & Thompson’s (1943) tables, and the difference 
between these and the kth column (k> 1) shows how much out we might be in assessing the 
significance of each variance ratio singly, rather than as one value in a group of k. Each row 
and column of the percentage points in Table 3 were differenced to the third order as a final 
check. No suspicious-looking values were found. 
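With modern computing facilities the percentage points for $m = 1$ can also be approached by direct quadrature of the studentized integral, instead of by the series in $\nu^{-1}$ described above. The sketch below is such an alternative, not the method used for Table 3: it assumes the integrand is the density of $s_0$ (with $\nu s_0^2$ distributed as $\chi^2$ on $\nu$ degrees of freedom) multiplied by $\{\mathrm{erf}(Qs_0/\sqrt2)\}^k$, applies Simpson's rule on a truncated range suitable for $\nu$ of about 5 or more, and solves for the percentage point by bisection. All function names and numerical settings are illustrative.

```python
import math

def prob_largest_m1(Q, k, nu, n=2000, s_max=4.0):
    """Probability that the largest of k ratios (1 d.f. each) is <= Q^2."""
    log_c = math.log(2) + (nu / 2) * math.log(nu / 2) - math.lgamma(nu / 2)

    def integrand(s):
        if s == 0.0:
            return 0.0                      # density vanishes at 0 for nu > 1
        dens = math.exp(log_c + (nu - 1) * math.log(s) - nu * s * s / 2)
        return dens * math.erf(Q * s / math.sqrt(2))**k

    h = s_max / n
    total = integrand(0.0) + integrand(s_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def upper_point(k, nu, alpha=0.05):
    """Upper 100*alpha % point of the largest variance ratio, by bisection on Q."""
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if prob_largest_m1(mid, k, nu) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return hi * hi                          # the variance ratio is Q^2

if __name__ == "__main__":
    for k in (1, 2, 6, 10):
        print(k, round(upper_point(k, 20), 2), round(upper_point(k, 20, 0.01), 2))
```

For $k = 1$ the routine should return the ordinary percentage points of $F$ for 1 and $\nu$ degrees of freedom; for larger $k$ its output may be set against the entries of Table 3.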


Table 3. Upper per cent points of the largest variance ratio*

5 % points

  ν    k = 1      2      3      4      5      6      7      8      9     10
  10    4.96   6.79   8.00   8.96   9.78  10.52  11.18  11.79  12.36  12.87
  12    4.75   6.44   7.53   8.37   9.06   9.68  10.20  10.68  11.12  11.53
  15    4.54   6.12   7.11   7.86   8.47   8.98   9.43   9.82  10.19  10.52
  20    4.35   5.81   6.72   7.40   7.94   8.39   8.79   9.13   9.44   9.71
  30    4.17   5.52   6.36   6.97   7.46   7.87   8.21   8.51   8.79   9.03
  60    4.00   5.25   6.02   6.58   7.02   7.38   7.68   7.96   8.20   8.41
  ∞     3.84   5.00   5.70   6.21   6.60   6.92   7.20   7.44   7.65   7.84

1 % points

  ν    k = 1      2      3      4      5      6      7      8      9     10
  10   10.04  13.17  15.08  16.43  17.43  18.25  18.91  19.48  19.97  20.41
  12    9.33  11.88  13.52  14.73  15.69  16.47  17.12  17.68  18.16  18.60
  15    8.68  10.82  12.18  13.21  14.03  14.72  15.30  15.81  16.26  16.66
  20    8.10   9.93  11.08  11.93  12.61  13.19  13.67  14.09  14.49  14.83
  30    7.56   9.16  10.14  10.86  11.43  11.90  12.31  12.66  12.97  13.26
  60    7.08   8.49   9.34   9.95  10.43  10.82  11.15  11.45  11.72  11.95
  ∞     6.63   7.88   8.61   9.15   9.54   9.87  10.16  10.41  10.62  10.82

* $k$ is the number of independent variance estimates, each based on 1 degree of freedom; $\nu$ denotes the degrees of freedom of the independent ‘error’ variance.


4. An illustration 

A good example to illustrate a possible application of Table 3 is provided in Wishart’s (1938) paper. He analysed the data (number of sticks) of a uniformity trial on asparagus as if it were an experiment with nitrate (N), phosphate (P) and potash (K) at two levels each, in eight blocks of four plots confounding the three-factor interaction between blocks.† The analysis of variance is set out in Table 4.

† The purpose of this preliminary trial was to collect material for adjustment, by the covariance technique, of inherent fertility differences among the plots when the N, P, K treatments were given in the ensuing season.






Wishart writes: ‘The mean square for total “treatments” (6 d.f.) is not significant, but on examination of the separate effects it would appear that the PK interaction is significant at the 5 % level [F = 4.41, as against the observed F = 7.02.] With many ordinary experiments we should be happy to claim such a result as indicating a real effect.... But since the treatments were not applied, the effect is entirely accidental.’

He then applies Bartlett’s test for homogeneity of variance (which in this case is equivalent to the Neyman-Pearson $L_1$ test) on the six treatment variances and finds that ‘the set of mean squares is compatible with the hypothesis that it is homogeneous, and awkward explanations are avoided’.

D. J. Bishop & U. S. Nair (1939) found that Bartlett’s test underestimates the significance if the degrees of freedom of some of the $k$ variances are as small as 1 or 2. The 5 % point of $L_1$ which they give when $k = 6$ and $\nu = 1$ is 0.094 against Bartlett’s value of 0.077. The value of $L_1$ for the present example is 0.152. It is not significant,* even by the exact $L_1$-test of Bishop & Nair.


Table 4. Analysis of variance

  Variation             Degrees of    Mean square    Variance
                        freedom       (variance)     ratio (F)
  Blocks                    7           6543.2          -
  N                         1            488.3         0.35
  P                         1             69.0         0.05
  K                         1             34.0         0.02
  NP                        1            830.3         0.60
  NK                        1             57.8         0.04
  PK                        1           9765.0         7.02
  (Total treatments)       (6)         (1874.1)         -
  Error                    18           1391.2          -



Since only one of the six variances is standing out prominently, Cochran’s (1941) test based on $g$, the ratio of the largest variance divided by the sum of the set of variances, may appeal as more appropriate than the $L_1$-test. The value of $g$ for the six treatment variances of Table 4 is 0.87. The 5 % value of $g$ given in Cochran’s table is 0.78. According to the $g$-test, therefore, the variance for interaction PK is significantly large, making us reject a hypothesis which we know to be true.

To decide which of the two tests will be more appropriate in a situation where the heterogeneity among the $k$ variances is caused by the presence of a single variance much larger than the rest, would need a study of the power functions of the two tests for this particular class of alternative hypotheses, using the Neyman-Pearson theory.

In the present example, however, neither the $L_1$ nor the $g$-test takes into account the estimated ‘error’ variance with the help of which it has been possible to say that the total treatment effects (6 d.f.) were not significant. The question now to be answered is whether the variance due to interaction PK, being the largest of the six individual treatment effects, could be considered as significantly greater than the error variance. The variance ratio for PK is 7.02. The 5 % point in Table 3 for $k = 6$, $\nu = 18$ is 8.59. The observed largest variance ratio is, therefore, not significantly large at the 5 % level.
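The arithmetic of this illustration can be retraced in a few lines. The sketch below is illustrative only: the PK mean square is taken as 9765.0 (an assumption consistent with the quoted ratio 7.02 and the total-treatments mean square), the tabular entries 8.98 and 8.39 are the 5 % points for $k = 6$ at $\nu = 15$ and 20, and harmonic interpolation is understood as linear interpolation in $1/\nu$. The function name is hypothetical.

```python
# Treatment mean squares of Table 4 (PK taken as 9765.0, see the note above)
mean_squares = {"N": 488.3, "P": 69.0, "K": 34.0,
                "NP": 830.3, "NK": 57.8, "PK": 9765.0}
error_ms = 1391.2          # error mean square on 18 degrees of freedom

def harmonic_interpolate(nu, nu_lo, val_lo, nu_hi, val_hi):
    """Interpolate a tabular value linearly in 1/nu between two tabled rows."""
    weight = (1 / nu_lo - 1 / nu) / (1 / nu_lo - 1 / nu_hi)
    return val_lo + weight * (val_hi - val_lo)

# Largest variance ratio and Cochran's g for the six treatment mean squares
largest = max(mean_squares.values())
largest_ratio = largest / error_ms
cochran_g = largest / sum(mean_squares.values())

# 5 % point of the largest variance ratio for k = 6, nu = 18
critical = harmonic_interpolate(18, 15, 8.98, 20, 8.39)

if __name__ == "__main__":
    print(round(largest_ratio, 2), round(cochran_g, 2), round(critical, 2))
    print("significant at 5 %:", largest_ratio > critical)
```

The observed ratio falls short of the interpolated 5 % point, in agreement with the conclusion drawn in the text.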


* For $L_1$, significance is indicated by low, not high, values.





5. Probability integral of the smallest variance ratio

Let ${}_\nu p_k(q)$ denote the probability that $s_1/s_0$ is $\ge q$. The probability that the smallest variance ratio $F_1$ is $\ge q^2$ will also be given by ${}_\nu p_k(q)$.

Finney (1941) showed that ^ (?) = ( i _ jf)k ( i + A)-» (19) 

where M is as defined in (2) except for a replacement of Q by q ; and A is to be put equal to 
zero after differentiation. 

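Using Hartley’s general studentized integral, we have

$${}_\nu p_k(q) = 2\,\Gamma(\tfrac12\nu)^{-1} (\tfrac12\nu)^{\frac12\nu} \int_0^{\infty} s_0^{\nu-1} e^{-\frac12\nu s_0^2}\, p_k(q s_0)\,ds_0, \qquad (20)$$

where

$$p_k(q) = \left\{2\,\Gamma(\tfrac12 m)^{-1} (\tfrac12 m)^{\frac12 m} \int_q^{\infty} x^{m-1} e^{-\frac12 m x^2}\,dx\right\}^k. \qquad (21)$$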


Finney remarks: ‘The convergence of the series by which significance levels for the smallest variance ratio are obtained is much more rapid than for the largest ratio. Also, as Hartley has demonstrated, the approximation given by an assumption of independence is very close even for small numbers of degrees of freedom.’

These remarks could easily be substantiated when $m = 2$ for which the exact value of ${}_\nu p_k(q)$ obtained from either (19) or (20) is

$${}_\nu p_k(q) = \left(1 + \frac{2kq^2}{\nu}\right)^{-\frac12\nu}. \qquad (22)$$

On the assumption of independence as in (4), the approximate value for ${}_\nu p_k(q)$ will be

$$\{{}_\nu p_1(q)\}^k = \left(1 + \frac{2q^2}{\nu}\right)^{-\frac12 k\nu}. \qquad (23)$$

Evidently (22) and (23) are not identical,* but when $q$ is small and $\nu$ is not too small, they will be very close to each other.
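The closeness of the two expressions for $m = 2$ is readily seen numerically. The following sketch assumes the exact form $(1 + 2kq^2/\nu)^{-\frac12\nu}$ and the independence form $(1 + 2q^2/\nu)^{-\frac12 k\nu}$, which is how (22) and (23) are read here; the function names are illustrative.

```python
def exact_smallest_m2(q, k, nu):
    """Probability that the smallest of k ratios (m = 2) is >= q^2, exact form."""
    return (1 + 2 * k * q * q / nu)**(-nu / 2)

def approx_smallest_m2(q, k, nu):
    """The same probability on the assumption of independent ratios."""
    return (1 + 2 * q * q / nu)**(-k * nu / 2)

if __name__ == "__main__":
    for q in (0.01, 0.05, 0.10):
        e = exact_smallest_m2(q, 10, 10)
        a = approx_smallest_m2(q, 10, 10)
        print(q, round(e, 6), round(a, 6), round(e - a, 6))
```

Even for $k = 10$ and $\nu = 10$ the two values differ by less than 0.001 at $q = 0.10$, and the difference shrinks rapidly as $q$ decreases.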

As we are concerned with the lower percentage points of the smallest variance ratio, q 
will be small and hence approximation (4) should yield sufficiently accurate results for any m. 

We may, however, examine this more closely for the case $m = 1$. The exact value of ${}_\nu p_k(q)$ can be obtained using the expansion (6) of Part I. Owing to the smallness of $q$, it is scarcely necessary to add terms beyond $a_1$. We may, however, include $a_2$ which is approximately equal to $-a_1/8$. The $a_0$ and $a_1$ terms are calculated from the formulae

$$a_0 = p_k(q) = \left\{\sqrt{\frac{2}{\pi}} \int_q^{\infty} e^{-\frac12 x^2}\,dx\right\}^k = (1 - \alpha)^k, \qquad (24)$$

$$a_1 = k(qz)(1 - \alpha)^{k-2}\{(k-1)qz + \tfrac12(q^2 + 1)(1 - \alpha)\}, \qquad (25)$$

where $\alpha$ and $z$ are quantities given in Table 2 of the Tables for Statisticians and Biometricians, Part I.

In Table 5, values of $a_0$ and $a_1$ are calculated for $q$ in the range 0 to 0.1 at intervals of 0.01 and for $k = 2$ to 10. The probability that the smallest of $k$ variance ratios, $F_1$, is $\le q^2$ will be given by

$$1 - {}_\nu p_k(q) = 1 - a_0 - \frac{a_1}{\nu}\left(1 - \frac{1}{8\nu}\right), \qquad (26)$$
* Finney claims in equation (21) of his paper that they are identical, obviously by oversight.





Table 5. Table for calculating the probability integral of the smallest variance ratio*

            k = 2                  k = 3                  k = 4
  q      a_0       a_1          a_0       a_1          a_0       a_1
 0.00  1.000 000  0.000 00    1.000 000  0.000 00    1.000 000  0.000 00
 0.01  0.984 106  0.003 99    0.976 254  0.005 98    0.968 465  0.007 98
 0.02  0.968 341  0.007 98    0.952 890  0.011 97    0.937 685  0.015 95
 0.03  0.952 707  0.011 97    0.929 906  0.017 95    0.907 650  0.023 90
 0.04  0.937 204  0.015 97    0.907 301  0.023 93    0.878 352  0.031 84
 0.05  0.921 836  0.019 97    0.885 074  0.029 90    0.849 780  0.039 74
 0.06  0.906 600  0.023 97    0.863 225  0.035 87    0.821 924  0.047 61
 0.07  0.891 502  0.027 98    0.841 750  0.041 83    0.794 775  0.055 43
 0.08  0.876 540  0.032 00    0.820 649  0.047 78    0.768 323  0.063 20
 0.09  0.861 717  0.036 02    0.799 921  0.053 72    0.742 556  0.070 90
 0.10  0.847 034  0.040 05    0.779 563  0.059 64    0.717 466  0.078 53

            k = 5                  k = 6                  k = 7
  q      a_0       a_1          a_0       a_1          a_0       a_1
 0.00  1.000 000  0.000 00    1.000 000  0.000 00    1.000 000  0.000 00
 0.01  0.960 738  0.009 97    0.953 072  0.011 96    0.945 468  0.013 95
 0.02  0.922 723  0.019 92    0.907 999  0.023 88    0.893 511  0.027 83
 0.03  0.885 927  0.029 83    0.864 725  0.035 72    0.844 029  0.041 56
 0.04  0.850 327  0.039 68    0.823 196  0.047 44    0.796 930  0.055 09
 0.05  0.815 893  0.049 46    0.783 357  0.059 00    0.752 118  0.068 36
 0.06  0.782 600  0.059 13    0.745 157  0.070 38    0.709 505  0.081 30
 0.07  0.750 421  0.068 69    0.708 543  0.081 53    0.669 002  0.093 88
 0.08  0.719 332  0.078 11    0.673 465  0.092 42    0.630 523  0.106 05
 0.09  0.689 306  0.087 38    0.639 874  0.103 03    0.593 986  0.117 76
 0.10  0.660 316  0.096 48    0.607 718  0.113 34    0.559 310  0.128 98

            k = 8                  k = 9                  k = 10
  q      a_0       a_1          a_0       a_1          a_0       a_1
 0.00  1.000 000  0.000 00    1.000 000  0.000 00    1.000 000  0.000 00
 0.01  0.937 924  0.015 94    0.930 440  0.017 92    0.923 017  0.019 90
 0.02  0.879 253  0.031 76    0.865 223  0.035 67    0.851 417  0.039 56
 0.03  0.823 829  0.047 35    0.804 112  0.053 09    0.784 867  0.058 75
 0.04  0.771 503  0.062 63    0.746 886  0.070 03    0.723 056  0.077 29
 0.05  0.722 126  0.077 50    0.693 329  0.086 39    0.665 681  0.095 03
 0.06  0.675 559  0.091 88    0.643 237  0.102 06    0.612 462  0.111 83
 0.07  0.631 667  0.105 70    0.596 416  0.116 95    0.563 132  0.127 60
 0.08  0.590 320  0.118 92    0.552 679  0.131 00    0.517 439  0.142 25
 0.09  0.551 390  0.131 47    0.511 848  0.144 14    0.475 142  0.155 72
 0.10  0.514 758  0.143 32    0.473 756  0.156 32    0.436 017  0.167 97

* $1 - {}_\nu p_k(q) = 1 - a_0 - \dfrac{a_1}{\nu}\left(1 - \dfrac{1}{8\nu}\right)$, see equation (26).




which can be calculated with the help of Table 5 for any given $q$, $k$ and $\nu$. If this probability is less than 0.05 or 0.01, the observed smallest variance ratio $q^2$ can be said to be significantly small at the 5 or 1 % level respectively.

The values of ${}_\nu p_k(q)$ given by Table 5 have been compared in Table 6 with those obtained by the approximate formula (4), for $\nu = 10$, $q = 0.10$ and 0.01.


Table 6. Comparison of exact and approximate values of the probability integrals* of the smallest variance ratio

              q = 0.10                       q = 0.01
  k    10p_k(q)    {10p_1(q)}^k       10p_k(q)    {10p_1(q)}^k
  1    0.922 324    0.922 321         0.992 218    0.992 218
  2    0.850 989    0.850 676         0.984 500    0.984 497
  3    0.785 452    0.784 596         0.976 845    0.976 836
  4    0.725 221    0.723 649         0.969 253    0.969 234
  5    0.669 843    0.667 437         0.961 722    0.961 691
  6    0.618 911    0.615 591         0.954 253    0.954 207
  7    0.572 047    0.567 773         0.946 845    0.946 781
  8    0.528 911    0.523 669         0.939 498    0.939 413
  9    0.489 191    0.482 990         0.932 210    0.932 102
 10    0.452 604    0.445 472         0.924 982    0.924 848

Here 10p_k(q) stands for ${}_\nu p_k(q)$ with $\nu = 10$.


There is satisfactory agreement between the two methods. Indeed, as is to be expected, the agreement is much better for $q = 0.01$ than for $q = 0.10$. The difference in the top row values in columns (2) and (3) is due to calculating the latter up to the term in $\nu^{-4}$ using the incomplete beta-function expansion (10) of Part I. It will be seen that for this expansion, terms up to $\nu^{-2}$ give a five decimal accuracy when $q = 0.10$ and this improves to six decimals when $q = 0.01$.

We may, before concluding, add an illustration for the use of the test of the smallest variance ratio. This is also taken from Wishart’s (1938) paper, where by the ordinary test of significance he gets variance for an interaction NP as significantly less than the error variance. He calls this a ‘negative interaction’. Applying Bartlett’s test, Wishart found that this ‘negative’ interaction was not significant. The values concerned are $s_1^2 = 0.01$, $s_0^2 = 5.57$, $k = 6$, $m = 1$, $\nu = 18$. To apply the test for significance of the smallest variance ratio, we calculate $F_1 = s_1^2/s_0^2 = 0.0018$ giving $q = 0.04$. Using (26) and referring to Table 5, we get

$${}_{18}p_6(0.04) = a_0 + a_1\left(1 - \frac{1}{8\nu}\right)\Big/\nu = 0.8258.$$

The probability that $F_1$ is less than or equal to the observed value is about 0.17, which is much greater than 0.05. The observed $F_1$ is therefore not significantly small.
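The entries of Table 5 and the illustration above may be recomputed directly. The sketch below is a minimal illustration under the following reading of the formulae: $\alpha$ is the normal area between $-q$ and $q$, $z$ is the normal ordinate at $q$, $a_0 = (1-\alpha)^k$, $a_1 = k(qz)(1-\alpha)^{k-2}\{(k-1)qz + \tfrac12(q^2+1)(1-\alpha)\}$, and the probability integral is $a_0 + (a_1/\nu)(1 - 1/8\nu)$. The function names are hypothetical.

```python
import math

def smallest_ratio_coefficients(q, k):
    """Return (a0, a1) for the smallest of k single-degree-of-freedom ratios."""
    alpha = math.erf(q / math.sqrt(2))                  # normal area between -q and q
    z = math.exp(-q * q / 2) / math.sqrt(2 * math.pi)   # normal ordinate at q
    a0 = (1 - alpha)**k
    a1 = k * q * z * (1 - alpha)**(k - 2) * ((k - 1) * q * z
                                             + 0.5 * (q * q + 1) * (1 - alpha))
    return a0, a1

def prob_smallest_at_least(q, k, nu):
    """Probability that the smallest ratio is >= q^2, with a2 taken as -a1/8."""
    a0, a1 = smallest_ratio_coefficients(q, k)
    return a0 + (a1 / nu) * (1 - 1 / (8 * nu))

if __name__ == "__main__":
    p = prob_smallest_at_least(0.04, 6, 18)
    print(round(p, 4), round(1 - p, 2))   # probability integral and its complement
```

With $q = 0.04$, $k = 6$ and $\nu = 18$ the complement comes out at about 0.17, as stated in the text.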


In conclusion, I should like to acknowledge warmly the help and guidance I have received from Prof. E. S. Pearson and Dr H. O. Hartley in the course of my investigations.

* The probability integral is calculated here from q to oo, instead of the usual limits 0 to q. 





REFERENCES 

Bishop, D. J. & Nair, U. S. (1939). A note on certain methods of testing for the homogeneity of a set of estimated variances. J. Roy. Statist. Soc. Suppl. 6, 89.

Cochran, W. G. (1941). The distribution of the largest of a set of estimated variances as a fraction of their total. Ann. Eugen., Lond., 11, 47.

Finney, D. J. (1941). The joint distribution of variance ratios based on a common error mean square. Ann. Eugen., Lond., 11, 130.

Fisher, R. A. (1924). On a distribution yielding the error functions of several well-known statistics. Proc. Intern. Math. Congress, Toronto, pp. 805-13.

Fisher, R. A. (1926). Expansion of ‘Student’s’ integral in powers of $n^{-1}$. Metron, 5, 109.

Hartley, H. O. (1938). Studentization and large-sample theory. J. Roy. Statist. Soc. Suppl. 5, 80.

Hartley, H. O. (1944). Studentization, or the elimination of the standard deviation of the parent population from the random sample-distribution of statistics. Biometrika, 33, 173.

Mahalanobis, P. C. (1944). On large-scale sample surveys. Phil. Trans. Roy. Soc. B, 231, 329.

Merrington, M. & Thompson, C. M. (1943). Tables of percentage points of the inverted Beta (F) distribution. Biometrika, 33, 73.

Nair, K. R. (1948). The distribution of the extreme deviate from the sample mean and its studentized form. Biometrika, 35, 118.

Neyman, J. & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes of statistical inference. Biometrika, 20 A, 175, 263.

Pearson, E. S. & Hartley, H. O. (1943). Tables of the probability integral of the ‘studentized’ range. Biometrika, 33, 89.

‘Student’ (1908). On the probable error of a mean. Biometrika, 6, 1.

Sukhatme, P. V. (1937). Tests of significance for samples of the $\chi^2$-population with two degrees of freedom. Ann. Eugen., Lond., 8, 52.

Thompson, C. M. (1941). Tables of percentage points of the incomplete beta-function. Biometrika, 32, 151.

Wishart, J. (1927). On the approximate quadrature of certain skew curves with an account of the researches of Thomas Bayes. Biometrika, 19, 1.

Wishart, J. (1938). Field experiments of factorial design. J. Agric. Sci. 28, 299.





THE ESTIMATION OF NON-LINEAR PARAMETERS BY 
‘INTERNAL LEAST SQUARES’ 

By H. O. HARTLEY 
1. Introduction 

(1·1) The difficulties arising with curvilinear regressions

In statistical regression of two variables y and x, the linear relation y = a + bx has received
most attention. Apart from occasional ‘fitting of polynomials’, non-linear relations have not
been used frequently. As reasons for the reluctance on the part of statistical workers to use
non-linear regressions we may mention here:

(a) The complication in the computational procedure when estimating non-linear
parameters by efficient methods such as least squares.

(b) The lack of exactness in tests of the goodness of fit and the difficulty of establishing
the random sampling distribution of fitted statistics.

(c) The fact that any transformations to linearity such as f(y) = bg(x) + a (which may be
suggested by theory), usually involve certain unknown parameters in f and g which must
be estimated from the sample and, therefore, bring in difficulties of type (b).

(d) The difficulty of deciding which of the many possible non-linear regressions is
suggested by theory.

As examples illustrating these difficulties we may mention here: 

(a) The iterative process in dosage mortality technique which necessitates the preparation 
of special auxiliary tables for each fitted law. 

(b) Methods of fitting the Gompertz curve by factorial moments (Sasuly, 1934) resulting 
in procedures of unknown efficiency. 

(c) The exponential law of diminishing returns:

    y = \beta(1 - e^{kx})    (1)

can be transformed to a linear relation between \log_e(1 - y/\beta) and x:

    \log_e(1 - y/\beta) = kx.    (1')

However, a knowledge of the ‘limiting response’ β is usually required in order to make this
transformation to linearity. If β is assessed by ‘inspection’ the accuracy of the subsequent
fit of k cannot be ascertained, let alone the accuracy of the assessment of β.

(d) In numerous applications polynomial regressions have been fitted without a biological 
or social theory to guide the fit and, as a result, difficulties have been encountered in inter- 
preting the real meaning of the polynomial terms. 
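Difficulty (c) can be illustrated numerically. The sketch below is not part of the original computations: it takes the exact series y = 1 - e^{-x} (so that β = 1 and k = -1), applies the transformation (1') once with the true β and once with a hypothetical ‘inspection’ value β = 1.1, and fits k as a regression through the origin. The function name and the value 1.1 are assumptions made for illustration only.

```python
import math

def slope_through_origin(xs, ys, beta):
    """Fit k in log(1 - y/beta) = k*x by least squares through the origin."""
    logs = [math.log(1.0 - y / beta) for y in ys]
    return sum(x * v for x, v in zip(xs, logs)) / sum(x * x for x in xs)

xs = [0, 1, 2, 3, 4]
ys = [1.0 - math.exp(-x) for x in xs]      # exact law with beta = 1, k = -1

k_true_beta = slope_through_origin(xs, ys, beta=1.0)     # recovers k = -1
k_guessed_beta = slope_through_origin(xs, ys, beta=1.1)  # biased by the wrong asymptote
```

With the true β the slope returns k = -1 to rounding error, whereas a 10 % error in the assumed asymptote shifts the fitted k appreciably, and nothing in the linear fit itself reveals this.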

(1·2) The principle of ‘internal regression’

The principle here adopted is that of bringing most of the important regression laws under
one generating principle: most of the laws in physics and technology are generated by simple,
mainly linear, relationships between the function and its first and second derivative (e.g.
velocity and acceleration) and the same applies to most of the important laws in the
biological sciences. We may name two examples:





(A) The exponential law mentioned above is generated by the first order differential
equation

    \frac{dy}{dx} = -k\beta + ky,    (2)

which is a linear relationship. If applied to fertilizer response in agriculture (k < 0), this has
the simple meaning that the additional yield, dy, caused by an addition of fertilizer dx, is,
in the first place, proportional to dx (i.e. = β(-k)dx); but this constant rate of increase
(β(-k)) is retarded by an amount (-(-k)y) which in turn is proportional to the yield y
already attained.

(B) The ‘trade cycle’ for, say, a production index y is sometimes regarded as being caused
by the interplay of production y and demand z which are assumed to satisfy the relations

    \frac{dy}{dt} = \alpha z,    \frac{dz}{dt} = -\beta y,

resulting in

    \frac{d^2y}{dt^2} = -\alpha\beta y,    (3)

of which the general solution is the trade cycle law:

    y = \sin[\sqrt{\alpha\beta}\,t + \gamma].    (4)

The idea, now, is to fit directly to the data the generating linear law (such as (2) or (3))
resulting in linear equations for the parameter estimates, rather than fitting the non-linear
regressions (such as (1) or (4)). However, some modifications are required: whilst for a
mathematical function y(x) the regression (1) and the differential equation (2) are identical
conditions, it would be difficult to fit to an empirical series of observed y_i a condition involving
its differential coefficient. We therefore seek a finite difference equivalent to the differential
equation (2). As is well known (see, for example, Bartlett, 1946; Cunningham & Hynd, 1946;
Kendall, 1944) the first order linear difference equation of the form

    y_i - y_{i-1} = b y_i + a    (5)

is capable of generating exactly the exponential law of type (1). Integrating (or summing)
(5) we obtain

    y_j = bY_j + ax_j + c,    (6)

where Y_j = \sum_{i=0}^{j} y_i. Now this equation may be regarded as a linear regression equation for the
dependent variable y_j with its own progressive sum Y_j as independent variable, and with x_j
playing the part of a second independent variable. Formally, therefore, the non-linear
parameters β and k of (1) can be estimated from the simple linear regression coefficients
a, b, c given in (6).†
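The step from the difference equation to the integrated regression may be written out as follows. This is only a sketch: it assumes that (5) reads y_i - y_{i-1} = b y_i + a, that x_j = j, and that Y_j is the sum of y_0, ..., y_j; the constant c is then fixed by the first observation.

```latex
\begin{align*}
\sum_{i=1}^{j}\,(y_i - y_{i-1}) &= b\sum_{i=1}^{j} y_i + a\,j ,\\
y_j - y_0 &= b\,(Y_j - y_0) + a\,x_j ,\\
y_j &= b\,Y_j + a\,x_j + c , \qquad c = (1-b)\,y_0 .
\end{align*}
% Solving (5) for y_i gives y_i = (y_{i-1} + a)/(1-b): the series approaches
% its limiting value -a/b geometrically, i.e. it follows an exponential law.
```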

In a similar manner the second order differential equation (3) can be replaced by a linear
relation between y_j and its second sum, ^2Y_j = \sum_{i=0}^{j}\sum_{l=0}^{i} y_l, and such a relation can be shown to be
equivalent to the harmonic regression law (trade cycle) (4). The period of this trade
cycle can therefore be determined as a partial linear regression coefficient between y_j and ^2Y_j.

Turning now to the general case, we shall call any regression equation in which the
dependent variate y is related to its own repeated sums

    ^1Y_j = \sum_{i=0}^{j} y_i,    ^2Y_j = \sum_{i=0}^{j}\sum_{l=0}^{i} y_l,    (7)

etc., as independent variables, an ‘internal regression’. This concept is, of course, closely
linked with that of auto-regression in time series. Its usefulness lies in its close relation to
linear differential equations, which are known to generate most of the important regression
laws and also give a better understanding of the physical or biological mechanism producing
the respective curvilinear regression.

† In §(2·1) we shall give the exact relation between the parameters β, k and a, b, c.

The principle of estimation and goodness of fit that we shall use in conjunction with 
internal regression is that of least squares applied to the ‘integrated’ equation (6) and not 
to the corresponding difference equation (5) (see, for example, Bartlett, 1946; Mann & 
Wald, 1943). This method is, at first, accepted as a working rule and the formulae for the 
estimators derived and demonstrated in terms of examples. Only the first order internal 
regression is treated here (§2). The efficiency of the estimator is then considered under 
various assumptions of residual independence. In particular, the classical assumption of
random residuals attached to the observed y_i is fully investigated by comparing the efficiency
of our estimators with the 100 % efficient transcendental maximum likelihood roots. To
this end, large sample formulae for the variances of the estimators will be developed (§3),
leaving the derivation of exact sampling distributions for goodness of fit tests to a later paper.


2. The procedure of internal regression


(2·1) The first order internal regression; fitting of the exponential law

Consider a sample of observed values y_i corresponding to discrete integer positions, x_i = i,
of the independent variable x (e.g. a time series). If the y_i satisfy exactly the difference
equation

    (y_{i+1} - y_i) = -\tfrac{1}{2}b\,(y_{i+1} + y_i) + a    (8)

then, by standard finite difference technique, they also satisfy exactly the exponential law

    y_i = \beta(1 - f e^{kx_i}),    (9)

where the limiting response β and the ‘exponential curvature’ k are given by

    \beta = a/b,    -k = 2\tanh^{-1}\tfrac{1}{2}b    (10)

whilst f is a constant of integration. In order to find the least square estimates for a and b
it is convenient to distinguish two cases:

(2·11) n = odd = 2m + 1 (i = -m, -m + 1, ..., -1, 0, 1, ..., m),

(2·12) n = even = 2m (i = -m, ..., -1, 1, ..., m).
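The equivalence of (8) with (9) and (10) is easily checked numerically. The following sketch iterates the difference equation from an arbitrary starting value and compares the result with the closed form; the numerical values of a, b and y_0 are illustrative assumptions and do not come from the paper.

```python
import math

def iterate_difference_equation(y0, a, b, steps):
    """Generate y_0..y_steps from (y[i+1]-y[i]) = -0.5*b*(y[i+1]+y[i]) + a."""
    ys = [y0]
    for _ in range(steps):
        y = ys[-1]
        # Equation (8) solved for y[i+1].
        ys.append(((1.0 - 0.5 * b) * y + a) / (1.0 + 0.5 * b))
    return ys

a, b, y0 = 0.5, 0.4, 0.1          # illustrative values only
beta = a / b                      # limiting response, eq. (10)
k = -2.0 * math.atanh(0.5 * b)    # exponential curvature, eq. (10)
f = 1.0 - y0 / beta               # constant of integration fixed by y_0

ys = iterate_difference_equation(y0, a, b, 10)
closed_form = [beta * (1.0 - f * math.exp(k * i)) for i in range(11)]
max_gap = max(abs(u - v) for u, v in zip(ys, closed_form))
```

The largest discrepancy `max_gap` is of the order of rounding error, since e^k = (1 - b/2)/(1 + b/2) is exactly the ratio by which the deviation from β shrinks at each step.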

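(2·11) We introduce

    \bar{y} = (2m+1)^{-1}\sum_{i=-m}^{+m} y_i,    \eta_i = y_i - \bar{y},    \xi_i = i.    (11)

Hence

    \sum\eta_i = \sum\xi_i = 0.    (12)

We rewrite (8) as

    (\eta_{i+1} - \eta_i) = -\tfrac{1}{2}b\,(\eta_{i+1} + \eta_i) + (a - b\bar{y})    (13)

and sum the difference equation from i = 0 to i = j, introducing at the same time as a new
variable S_j the first sum of the η_i as follows:

    S_j = \tfrac{1}{2}\eta_0 + \sum_{i=1}^{j-1}\eta_i + \tfrac{1}{2}\eta_j      for j \ge 1,
    S_0 = 0,
    S_j = -\tfrac{1}{2}\eta_0 - \sum_{i=j+1}^{-1}\eta_i - \tfrac{1}{2}\eta_j    for j \le -1.    (14)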



The progressive sums of (13) are then found to be equivalent to the equations:

    \eta_j - \eta_0 = -bS_j + (a - b\bar{y})\,\xi_j + c,    (15)

or

    \eta_j = -bS_j + a'\xi_j + c',    (16)

where

    a' = a - b\bar{y},    or    [illegible],    (17)

and c, c' are constants of summation.



The least square solutions for a', -b and c' are then given by:

    -b\sum S_i^2 + a'\sum S_i\xi_i + c'\sum S_i = \sum S_i\eta_i,
    -b\sum S_i\xi_i + a'\sum\xi_i^2 = \sum\eta_i\xi_i,
    -b\sum S_i + c'n = 0.    (18)

It is now easy to see from the definition of the S_j that they are exactly orthogonal to the
η_j, i.e. that

    \sum_{i=-m}^{+m} S_i\eta_i = 0.    (19)

Using (19), subtracting the last equation in (18) from the first two and solving for -b and
a' we find

    -b = -\Delta^{-1}(\sum\eta_i\xi_i)(\sum S_i\xi_i),    a' = \Delta^{-1}(\sum\eta_i\xi_i)\sum(S_i - \bar{S})^2,    (20)

where

    \Delta = \sum(S_i - \bar{S})^2\,\sum\xi_i^2 - (\sum S_i\xi_i)^2,    \bar{S} = n^{-1}\sum S_i

and the summation is extended from i = -m to i = m throughout. For practical use the
formulae (20) must be further simplified; in particular, we wish to avoid the calculation of
the deviates η_i. We therefore introduce sum values formed directly from the y_i


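    Y_j = \tfrac{1}{2}y_0 + \sum_{i=1}^{j-1} y_i + \tfrac{1}{2}y_j      for j \ge 1,
    Y_0 = 0,
    Y_j = -\tfrac{1}{2}y_0 - \sum_{i=j+1}^{-1} y_i - \tfrac{1}{2}y_j    for j \le -1,
    Y = \sum y_i,    (21)

and note the following relationships:

    \sum\xi_iS_i = -\tfrac{1}{2}\sum\eta_i\xi_i^2,    (22)

    \sum(S_i - \bar{S})^2 = \sum Y_i^2 + Yn^{-1}\sum\eta_i\xi_i^2 - Y^2n^{-2}\sum\xi_i^2 - n^{-1}(\sum Y_i)^2,    (23)

    \sum S_i = \sum Y_i.    (24)

With the help of these relations we reach the following working formulae for the estimators
a', -b and c' in the order of computation. From data calculate Y_i, \sum Y_i and Y (see (21)),

    \sum y_i\xi_i,    \sum y_i\xi_i^2,    \sum Y_i^2,    \sum\xi_i^2 = \tfrac{1}{3}m(m+1)(2m+1).

Compute

    \sum\eta_i\xi_i^2 = \sum y_i\xi_i^2 - Yn^{-1}\sum\xi_i^2,
    \sum(S_i - \bar{S})^2 from (23),
    \Delta = \sum\xi_i^2\,\sum(S_i - \bar{S})^2 - \tfrac{1}{4}(\sum\eta_i\xi_i^2)^2,
    a' = \Delta^{-1}\sum y_i\xi_i\,\sum(S_i - \bar{S})^2,
    -b = \tfrac{1}{2}\Delta^{-1}\sum y_i\xi_i\,\sum\eta_i\xi_i^2,
    c' = n^{-1}b\sum Y_i.    (25)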
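The working formulae rest on the orthogonality and on three summation identities. A short numerical check is sketched below for an arbitrary odd-length series; the identities are taken in the forms \sum S_i\eta_i = 0, \sum\xi_iS_i = -\tfrac{1}{2}\sum\eta_i\xi_i^2, \sum(S_i - \bar{S})^2 = \sum Y_i^2 + Yn^{-1}\sum\eta_i\xi_i^2 - Y^2n^{-2}\sum\xi_i^2 - n^{-1}(\sum Y_i)^2 and \sum S_i = \sum Y_i, which is an assumption about the printed text. The data and the helper name are hypothetical.

```python
def centred_sums(y):
    """Return (xi, eta, S, Y) for an odd-length series, with S and Y the
    trapezoidal progressive sums of eta and y measured from the central point."""
    n = len(y)
    m = n // 2
    ybar = sum(y) / n
    xi = [i - m for i in range(n)]
    eta = [v - ybar for v in y]

    def progressive(values):
        out = [0.0] * n                      # value 0 at the central index m
        for j in range(m + 1, n):            # positive side
            out[j] = out[j - 1] + 0.5 * (values[j - 1] + values[j])
        for j in range(m - 1, -1, -1):       # negative side
            out[j] = out[j + 1] - 0.5 * (values[j + 1] + values[j])
        return out

    return xi, eta, progressive(eta), progressive(y)

y = [3.0, 1.5, 4.0, 0.5, 2.0, 6.5, 1.0]      # arbitrary illustrative data, n = 7
n = len(y)
xi, eta, S, Yp = centred_sums(y)
total = sum(y)
S_bar = sum(S) / n

orthogonality = sum(s * e for s, e in zip(S, eta))              # should vanish
sum_xi_S = sum(x * s for x, s in zip(xi, S))
sum_eta_xi2 = sum(e * x * x for e, x in zip(eta, xi))
ss_S_direct = sum((s - S_bar) ** 2 for s in S)
ss_S_from_Y = (sum(v * v for v in Yp) + total / n * sum_eta_xi2
               - (total / n) ** 2 * sum(x * x for x in xi) - sum(Yp) ** 2 / n)
```

All four relations hold to rounding error for any odd-length series, which is why the deviates η_i and the sums S_i never need to be formed in practice.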



36 


Non-Unear parameters by * Internal Least Squares * 

Finally, we note the relation between the estimators a', -b and c' and the parameters
of the original exponential law, i.e. the limiting response β, the exponential curvature k
and f. We have

    \beta = a'/b + n^{-1}Y,    f\beta = a'/b - c',    k = -2\tanh^{-1}\tfrac{1}{2}b.    (26)

As an illustration and check of these formulae we apply them in example 1 below to the
theoretical sequence y_i = 1 - e^{-x_i}, x_i = 0(1)4. The table below is self explanatory, the work
following the computational order of equations (25).


Example 1. Theoretical series y_i = 1 - e^{-x_i}; estimation of parameters by internal least squares


    x_i    y_i      2Y_i (from (21))    ξ_i    ξ_i²
    0      0.000    -2.129              -2     4
    1      0.632    -1.497              -1     1
    2      0.865     0                   0     0
    3      0.950     1.816               1     1
    4      0.982     3.747               2     4

    Y = 3.429        1.936 = 2ΣY_i

    Σy_iξ_i  = 2.282,     Σξ_i² = 10,               ΣY_i² = 6.0270,
    Σy_iξ_i² = 5.510,
    Ση_iξ_i² = -1.350,    Σ(S_i - S̄)² = 0.2119,    Δ = 1.665,
    a' = 0.290,           b = 0.925,                c' = 0.179,
    β = 1.000,            k = -1.000,               fβ = 0.135 = e^{-2}.


The true values of the parameters are, of course, exactly reproduced. 
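The computation of example 1 can be retraced mechanically. The sketch below is one possible transcription of the working formulae (25) and (26), covering both odd and even n; it assumes the fitted law is written y_i = β - fβ exp(kξ_i) with ξ_i measured from the centre of the series, and the function name is hypothetical.

```python
import math

def internal_least_squares(y):
    """Return (beta, k, f_beta) for y_i = beta - f_beta*exp(k*xi_i), xi centred."""
    n = len(y)
    m = n // 2
    xi = [i - (n - 1) / 2 for i in range(n)]       # -m..m, or half-integers if n is even
    Yp = [0.0] * n                                  # progressive sums Y_j
    if n % 2:                                       # odd n: Y_0 = 0 at the central point
        up, down = m + 1, m - 1
    else:                                           # even n: half weights next to the centre
        Yp[m], Yp[m - 1] = 0.5 * y[m], -0.5 * y[m - 1]
        up, down = m + 1, m - 2
    for j in range(up, n):
        Yp[j] = Yp[j - 1] + 0.5 * (y[j - 1] + y[j])
    for j in range(down, -1, -1):
        Yp[j] = Yp[j + 1] - 0.5 * (y[j + 1] + y[j])

    total = sum(y)
    sum_xi2 = sum(x * x for x in xi)
    sum_y_xi = sum(v * x for v, x in zip(y, xi))
    sum_eta_xi2 = sum(v * x * x for v, x in zip(y, xi)) - total / n * sum_xi2
    ss_S = (sum(v * v for v in Yp) + total / n * sum_eta_xi2
            - (total / n) ** 2 * sum_xi2 - sum(Yp) ** 2 / n)          # relation (23)
    delta = sum_xi2 * ss_S - 0.25 * sum_eta_xi2 ** 2
    a1 = sum_y_xi * ss_S / delta                                       # a'
    b = -0.5 * sum_y_xi * sum_eta_xi2 / delta
    c1 = b * sum(Yp) / n                                               # c'
    return a1 / b + total / n, -2.0 * math.atanh(0.5 * b), a1 / b - c1

y = [1.0 - math.exp(-x) for x in range(5)]          # theoretical series of example 1
beta, k, f_beta = internal_least_squares(y)
```

Applied to the unrounded theoretical series, the sketch returns β = 1, k = -1 and fβ = e^{-2} to within rounding error, in agreement with the table.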


(2·12) The case n = even = 2m

The only modifications of the previous section consist in defining

    \xi_i = \tfrac{1}{2}(2i - 1)  for i = 1, ..., m;    \xi_i = \tfrac{1}{2}(2i + 1)  for i = -1, ..., -m    (27)

and

    Y_j = \sum_{i=1}^{j-1} y_i + \tfrac{1}{2}y_j      for j \ge 1,
    Y_j = -\sum_{i=j+1}^{-1} y_i - \tfrac{1}{2}y_j    for j \le -1,    (28)

and similarly for the S_j. All other definitions, formulae and results are the same as for odd n
except that the summations are from i = -m to -1 and from +1 to +m, omitting 0.

Below we illustrate the procedure by fitting the exponential law of diminishing returns
to yield data obtained in manurial trials of the National Agricultural Research Bureau,
China.† Both trials give the response of a number of wheat varieties to the application of
sulphate of ammonia at the rates of 4, 8, 12, 16 and 20 c./m.‡ The responses are grain yields
and we confine ourselves here to the mean varietal responses. In the first experiment
(example 2) it will be seen that there is a marked retardation of the response to the higher
fertilizer rates. In the second experiment (example 3), there is hardly any retardation and
the estimate of the limiting response is, consequently, very high. The fit is illustrated in
Figs. 1 and 2.

† The data were kindly put at my disposal by Dr H. L. Richardson of the Imperial Chemical
Industries Ltd. Many authors have applied the exponential law to the analysis of fertilizer trials (see,
for example, Crowther & Yates (1941)).

‡ c./m. = catties per mou, the Chinese measure of yield and rate of fertilizer application.

Fig. 1. Wheat yield; response to rate of fertilizer (example 2). Limiting response, β = 530 c./m.

Fig. 2. Wheat yield; response to rate of fertilizer (example 3). Limiting response, β = 1786 c./m.

Normally one would, of course, require more than six observations for fitting a three
parameter regression. However, the estimation of exponential curvature and limiting
response in short series is akin to the calculation of the ‘quadratic effect’ customary in
fertilizer trials with only three levels.




Example 2. N-fertilizer trial with wheat, Hopei, Tinghsien, 1936. Fit of exponential
law of diminishing returns to mean varietal responses

    Fertilizer rate    Response y_i =          2Y_i     2ξ_i    4ξ_i²
    in c./m.           10 × yield - 1500
    0                  127                     -1187    -5      25
    4                  151                     -909     -3      9
    8                  379                     -379     -1      1
    12                 421                      421      1      1
    16                 460                      1302     3      9
    20                 426                      2188     5      25

    Y  = 1964          Σ2Y_i = 1436            Check:† 1436 = 6(1307 - 657) - 2(1232)
    Y⁻ = 657
    Y⁺ = 1307

    Σy_iξ_i  = 1232,    Σξ_i² = 17.5,              ΣY_i² = 2 259 670,
    Σy_iξ_i² = 5031,                               ΣY_i = 718,
    Ση_iξ_i² = -697,    Σ(S_i - S̄)² = 70 524,     Δ = 1 112 718,
    a' = 78.1,          b = +0.386,                c' = 46.2,
    β = 530,            k = -0.391,                fβ = 156.

Fitted law:

    y_i = 529.6 - 156.1 exp(-0.391 ξ_i).
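As a cross-check on example 2, the internal regression can also be fitted without the working formulae, by regressing y_i directly on its progressive sum Y_i, on ξ_i and on a constant. The sketch below does this with a small hand-written solver for the normal equations; this route, the helper name and the variable names are assumptions made for illustration and are not the paper's computing scheme. The responses used are those consistent with the printed column of 2Y_i.

```python
import math

def solve_linear(A, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

y = [127, 151, 379, 421, 460, 426]             # responses of example 2
n = len(y)
m = n // 2
xi = [i - (n - 1) / 2 for i in range(n)]       # -2.5, ..., 2.5
Yp = [0.0] * n                                  # progressive sums for even n
Yp[m], Yp[m - 1] = 0.5 * y[m], -0.5 * y[m - 1]
for j in range(m + 1, n):
    Yp[j] = Yp[j - 1] + 0.5 * (y[j - 1] + y[j])
for j in range(m - 2, -1, -1):
    Yp[j] = Yp[j + 1] - 0.5 * (y[j + 1] + y[j])

# Regression y = -b*Y + a*xi + const, fitted by ordinary least squares.
cols = [Yp, xi, [1.0] * n]
normal = [[sum(u * v for u, v in zip(p, q)) for q in cols] for p in cols]
rhs = [sum(u * v for u, v in zip(p, y)) for p in cols]
neg_b, a, const = solve_linear(normal, rhs)
b = -neg_b
beta = a / b                                    # limiting response
k = -2.0 * math.atanh(0.5 * b)                  # exponential curvature
f_beta = beta - const                           # fitted law: beta - f_beta*exp(k*xi)
```

The estimates agree with the printed β = 530, k = -0.391 and fβ = 156 to within the rounding of the intermediate sums used in the hand computation.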


Example 3. N-fertilizer trial with wheat. Shantung, 1936. Fit of exponential
law of diminishing returns to mean varietal responses

    Fertilizer rate    Response y_i =          2Y_i     2ξ_i    4ξ_i²
    in c./m.           10 × yield - 3000
    0                  353                     -3179    -5      25
    4                  627                     -2199    -3      9
    8                  786                     -786     -1      1
    12                 717                      717      1      1
    16                 959                      2393     3      9
    20                 1093                     4445     5      25

    Y  = 4535          Σ2Y_i = 1391            Check: 1391 = 6(2769 - 1766) - 2(2313.5)
    Y⁻ = 1766
    Y⁺ = 2769

    Σy_iξ_i  = 2313.5,    Σξ_i² = 17.5,               ΣY_i² = 10 389 500,
    Σy_iξ_i² = 12982,                                 ΣY_i = 695.5,
    Ση_iξ_i² = -245,      Σ(S_i - S̄)² = 126 230,     Δ = 2 194 019,
    a' = 133,             b = 0.1292,                 c' = 15.0,
    β = 1786,             k = -0.1294,                fβ = 1015.

Fitted law:    y_i = 1786 - 1015 exp(-0.1294 ξ_i).


† It is convenient, when adding the y_i, to record Y⁻ = Σy_i for i < 0 and Y⁺ = Σy_i for i > 0, separately
(forming Y = Y⁻ + Y⁺ = Σy_i). This provides a check on the forming and copying of the Y_i from the
equation Σ2Y_i = n(Y⁺ - Y⁻) - 2Σy_iξ_i, e.g. in the present case 6(1307 - 657) - 2(1232) = 1436.

We now turn to two well-known regression laws which, by suitable transformation, can 
be reduced to our exponential law. 

(2·2) The internal least square fit of the logistic curve

The general form of the logistic curve is†

    z = A/(1 + Be^{-kx}).    (29)

In the case where A is known (as for instance with a biological response known to vary
between 0 and 100 %) the transformation

    \log(A/z - 1) = \log B - kx,    (30)

reduces the problem to a plain linear regression fit for log B and k. Nevertheless, this case
has recently been dealt with (Finney, 1947) by the more elaborate, but under certain
conditions more efficient, method of maximum likelihood.

We are here dealing with the general case where A has to be estimated from the data.
In this case the transformation y = 1/z yields

    y = 1/A + (B/A)e^{-kx},    (31)

which is of the form (1) or (9). Considerations of appropriateness and efficiency are, again,
postponed for discussion in §3 below, but we should mention here that Rhodes (1940) also
uses reciprocals when fitting the logistic by a different method. We are giving in example 4,
below, the data used by him, viz. population figures for the U.S. Census Data, 1800-1910.
It is with such data on population growth that the estimation of the logistic parameters is
of importance.

Example 4. U.S. Census population 1800-1910. Fit of logistic curve and
estimation of its parameters by internal least squares

    Year x    Population (millions) z    y = 10,000/z    2Y_i      2ξ_i    4ξ_i²
    1800      5.308                      1884            -10310    -11     121
      10      7.240                      1381            -7045     -9      81
      20      9.638                      1038            -4626     -7      49
      30      12.866                     777             -2811     -5      25
      40      17.069                     586             -1448     -3      9
    1850      23.192                     431             -431      -1      1
      60      31.443                     318              318       1      1
      70      38.558                     259              895       3      9
      80      50.156                     199              1353      5      25
      90      62.948                     159              1711      7      49
    1900      75.995                     132              2002      9      81
      10      91.972                     109              2243      11     121

    Y  = 7273
    Y⁻ = 6097
    Y⁺ = 1176

† In some applications it has been found necessary to generalize the logistic to

    y - C = A/(1 + B exp(-kx)),

where the lower asymptote C is a fourth parameter to be estimated from the sample. However, this
case is comparatively rare.















Following the formulae and procedure (25), we obtain

    a' = -172.16,    b = 0.3074,    c' = -232.40,

whence

    β = 50.4,    k = -0.3098,    fβ = -323.4.

By comparison with (31) we obtain from the above parameter estimates of the exponential
law, the corresponding estimates for the logistic. These are given below, alongside those
obtained by Rhodes, who used a method which is computationally more cumbersome as it
requires both the reciprocal, as well as a logarithmic transformation.

Example 4. U.S. Census data. Comparison of estimates of logistic parameters
obtained by Rhodes with those obtained by internal regression

    Parameter    Internal regression    Rhodes’ (1940) method
    10,000/β     198                    199
    -k           0.3098                 0.3127
    -βf          323                    328

The agreement is extremely close, which is to be expected as the logistic curve fits the
data well. In general, the theoretical properties of Rhodes’s method are difficult to assess
as it consists of two fitting procedures. The first (which is akin to a lag-1 autoregressive scheme)
provides the estimates of k and A; these are then used to transform to linearity (as in (30)),
so that the estimate of B/A is obtained as a mean of the log transforms.
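The passage from the exponential-law estimates to the logistic parameters is a matter of simple arithmetic, sketched below. It assumes the printed estimates β = 50.4, k = -0.3098, fβ = -323.4, the scaling y = 10,000/z, and a fitted law of the form y = β - fβ exp(kξ) with ξ counted in 10-year steps from the middle of the series; the variable names are hypothetical.

```python
import math

scale = 10_000.0                          # the example fits y = 10,000 / z
beta, k, f_beta = 50.4, -0.3098, -323.4   # printed estimates of example 4

A = scale / beta        # upper asymptote of the logistic, in millions
rate = -k               # exponent of the logistic per 10-year step
B = -f_beta / beta      # since (B/A)*scale = -f_beta and A = scale/beta

def logistic(xi):
    """Fitted population (millions) at centred position xi (10-year steps)."""
    return A / (1.0 + B * math.exp(-rate * xi))
```

The asymptote `A` reproduces the tabulated 10,000/β = 198 (millions), and `logistic(0)` gives the fitted population midway between the 1850 and 1860 censuses.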


(2·3) The internal least square fit of the Makeham-Gompertz curve

This curve is of fundamental importance in Life Table work. Its general form is

    z = a\exp(-be^{-kx}).    (32)

Using the transformation y = log z, we obtain

    y = \log a - be^{-kx},    (33)

which is of the form (1) or (9), so that its parameters can be estimated by the same method.
We intend to deal with this application more fully elsewhere.
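The correspondence between the two sets of parameters may be set out explicitly. The sketch below assumes the exponential law in the form y = β(1 - f e^{kx}) and writes k_G for the exponent of the Makeham-Gompertz curve, to keep it apart from the k of the exponential law; k_G is a label introduced here only for this comparison.

```latex
\begin{align*}
y = \log z &= \log a - b\,e^{-k_G x},\\
y &= \beta - \beta f\,e^{kx}
\quad\Longrightarrow\quad
\beta = \log a,\qquad \beta f = b,\qquad k = -k_G ,\\
a &= e^{\beta},\qquad b = \beta f,\qquad k_G = -k .
\end{align*}
```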


(2·4) The special case of the exponential law with known asymptote (Markoff chain)

A particular case of the first order linear difference equation is the case a = 0. In this
case, (8) is equivalent to

    (y_{i+1} - y_i) = -\tfrac{1}{2}b\,(y_{i+1} + y_i),    (34)

which is the systematic part of the Markoff chain equation

    y_{i+1} = \rho y_i + v_i,    (35)

in which v_i represents a random disturbance. In analogy to the general case, we have as
the solution of (34)

    y_i = y^* e^{kx_i},    (36)

where

    k = -2\tanh^{-1}\tfrac{1}{2}b,    y^* = -\beta f.    (37)

Dealing with the case n = odd = 2m + 1, retaining the definitions (21), we find by summation
of (34)

    \eta_j = -bY_j + w,    (38)

where w is a constant of integration. Obviously (34) and (38) are, again, equivalent equations.
The least square solutions for b and w in (38) are then given by

    \sum\eta_iY_i = -b\sum Y_i^2 + w\sum Y_i,    \sum\eta_i = -b\sum Y_i + wn = 0.    (39)

From the second equation it follows that w = bn^{-1}\sum Y_i and hence, from the first equation, that

    -b = \sum Y_i\eta_i \,/\, \sum(Y_i - \bar{Y})^2.    (40)

By partial summation we may transform (40) into

    -b = n^{-1}\sum y_i\,\sum y_i\xi_i \,/\, \sum(Y_i - \bar{Y})^2,

which is our least square estimate of b. It is proportional to the mean of the y and to the
regression coefficient y on x, whilst it is inversely proportional to the sum of squares of the
progressive totals Y_i of the y_i. The computational procedure is simplified as \sum\eta_i\xi_i^2 is no
longer required.
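A sketch of this simplified computation is given below for odd n. It takes the estimator in the form -b = n^{-1} Σy_i Σy_iξ_i / Σ(Y_i - Ȳ)², with the Y_i formed as trapezoidal progressive sums from the central point, and converts b to k by k = -2 tanh^{-1}(b/2); the function name and the test series are assumptions made for illustration.

```python
import math

def markoff_chain_fit(y):
    """Return (b, k) for the law y_i = y* exp(k*xi_i), odd n, asymptote at zero."""
    n = len(y)
    if n % 2 == 0:
        raise ValueError("this sketch covers odd n only")
    m = n // 2
    xi = [i - m for i in range(n)]
    Yp = [0.0] * n                              # progressive sums, zero at the centre
    for j in range(m + 1, n):
        Yp[j] = Yp[j - 1] + 0.5 * (y[j - 1] + y[j])
    for j in range(m - 1, -1, -1):
        Yp[j] = Yp[j + 1] - 0.5 * (y[j + 1] + y[j])
    Y_bar = sum(Yp) / n
    ss_Y = sum((v - Y_bar) ** 2 for v in Yp)
    neg_b = (sum(y) / n) * sum(v * x for v, x in zip(y, xi)) / ss_Y
    b = -neg_b
    return b, -2.0 * math.atanh(0.5 * b)

y = [3.0 * math.exp(-0.2 * i) for i in range(-4, 5)]   # exact series, k = -0.2
b, k = markoff_chain_fit(y)
```

For a series that follows the exponential law exactly, the estimator reproduces k without error, since b = 2 tanh(0.1) is then the exact coefficient of the difference equation.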

3. Large sample theory of the internal least square estimators 

(3·1) The relation between internal least square and ordinary least square estimates

In this section we compare the basic assumption on which the internal least square method
is based with that made in the ordinary least square hypothesis. We confine ourselves to
the special case of an exponential law with x axis as asymptote, as in (36). The general case
follows on the same lines. If we denote the residuals of (36) by ε_i, viz.

    \epsilon_i = y_i - y^* e^{kx_i},    (41)

then the assumption on which the classical least square fit is based, is that the ε_i are
independent and have the same variance. The roots y* and k of Σε_i² = minimum, are then
maximum likelihood estimates. If on the other hand we assume that the residuals of (38), viz.

    \zeta_i = \eta_i + b\,(Y_i - \bar{Y}),    (42)

are independent and have the same variance, then the internal least square estimators have
the maximum likelihood property. Yet another hypothesis can be investigated:

Analogous to Bartlett’s (1946) treatment of stationary time series in correlogram work,
we may consider the residuals

    \theta_i = (y_{i+1} - y_i) + \tfrac{1}{2}b\,(y_{i+1} + y_i)    (43)

in the difference equation (34). The assumption of independent deviates, θ_i, which is the
analogue to Bartlett’s starting point, would lead in the present case to the estimator

    -\tfrac{1}{2}b = (y_m^2 - y_{-m}^2) \,/\, \sum(y_{i+1} + y_i)^2.    (44)

This is unsuitable, as its numerator depends on the first and last observation only. It is
interesting to note the relation between the ε_i and ζ_i of (41) and (42). We find:

    \zeta_j = \epsilon_j + b\,(E_j - \bar{E}),    (45)

where E_j and Ē are formed from the ε_i on the lines of (21). Since b is of the order n^{-1} and
therefore small,† it follows that the second term in (45) is small compared with the first,
particularly near the centre of the range j = 0. Thus the assumption of independent ζ_i does
not differ seriously from that of the classical least square method. It is unlikely, therefore,
that many situations will arise in which the assumption of independent ζ_i can be disproved
whilst the hypothesis of independent ε_i can be accepted. Nevertheless, we give in the next
section the loss in efficiency resulting from the use of the internal least-square-fit under the
more usual hypothesis that the ε_i are independent deviates with equal variance.

† It must be remembered that Y_m is a total of values of y_i and hence, from (38), b ~ 2y/(n\bar{y}).


(3·2) The efficiency of the internal least square estimator

In this section we confine ourselves to the case a = 0, i.e. the exponential law with x axis as
asymptote. The treatment of the general first order internal regression is on similar lines but
algebraically more tedious. Further, in order to simplify the formulae we approximate
exponential sums of the type

    h\sum_{i=-m}^{+m} e^{kx_i} f(x_i)    with x_{i+1} - x_i = h and n = 2m + 1,

by the integrals

    \int_{-\frac{1}{2}X}^{+\frac{1}{2}X} e^{kx} f(x)\,dx,

with X = nh. The error thereby committed is small, provided n is moderate or large.
The exact summations could be carried out, but are more tedious.

We first derive the transcendental equations for the ordinary least square estimators and
derive their variances from the general maximum likelihood formulae. We then derive an
approximate large sample formula for the variance of the internal least square estimator b,
and finally by comparison with the former, its efficiency.


(3·21) Maximum likelihood results

Let the range of the independent variate be -½X ≤ x ≤ ½X, X = nh, and let us try to
fit the expression

    y(x) = y^* e^{kx}    (46)

to the observed series y(x_i) where x_i = ih, h = X/n. Then, by ordinary least square or
maximum likelihood procedure, we minimize

    L = X^{-1}\int_{-\frac{1}{2}X}^{+\frac{1}{2}X} (y - y^* e^{kx})^2\,dx,    (47)

resulting in the non-linear system of equations

    \partial L/\partial k = -2X^{-1}\int_{-\frac{1}{2}X}^{+\frac{1}{2}X} (y - y^* e^{kx})\,e^{kx}\,x\,y^*\,dx = 0,
    \partial L/\partial y^* = -2X^{-1}\int_{-\frac{1}{2}X}^{+\frac{1}{2}X} (y - y^* e^{kx})\,e^{kx}\,dx = 0.    (48)

In order to obtain the variance of the maximum likelihood estimator of k (\hat{k} say) we must
form the Hessian

    A = \tfrac{1}{2}\begin{pmatrix} \partial^2L/\partial k^2 & \partial^2L/\partial k\,\partial y^* \\ \partial^2L/\partial k\,\partial y^* & \partial^2L/\partial y^{*2} \end{pmatrix}.    (49)

We obtain

    A = \begin{pmatrix}
        \tfrac{1}{8}X^{-1}y^{*2}k^{-3}\{(2+q^2)(e^q - e^{-q}) - 2q(e^q + e^{-q})\} & \tfrac{1}{4}X^{-1}y^*k^{-2}\{q(e^q + e^{-q}) - (e^q - e^{-q})\} \\
        \tfrac{1}{4}X^{-1}y^*k^{-2}\{q(e^q + e^{-q}) - (e^q - e^{-q})\} & \tfrac{1}{2}X^{-1}k^{-1}(e^q - e^{-q})
        \end{pmatrix},    (50)

where q = kX and k is the true parameter. Accordingly we find for the variance of \hat{k} (which
is given by the ratio of first minor to n times the determinant)

    Variance \hat{k} = \sigma_\epsilon^2\,8y^{*-2}q^3n^{-1}X^{-2}(e^q - e^{-q})/(-4q^2 + (e^q - e^{-q})^2),    (51)

which can be reduced to

    Variance \hat{k} = \sigma_\epsilon^2\,8y^{*-2}q^3n^{-1}X^{-2}\sinh q/(\cosh 2q - 2q^2 - 1).    (52)

Letting k → 0, but keeping X fixed (i.e. letting q → 0) we have

    Variance \hat{k} \to \sigma_\epsilon^2\,12y^{*-2}n^{-1}X^{-2}.    (53)

As an independent check on (53) we remember that for q → 0 with y* large and fixed, the
exponential law (46) will be approximately represented by the line

    y(x) = y^* + y^*kx.    (54)

The maximum likelihood estimator of y*k, that is the ordinary linear regression coefficient,
has a variance σ_ε²/Σ(x_i - x̄)² which, for the present sample of x_i equidistantly spread over
the interval of length X, tends to σ_ε² 12/(X²n), thereby confirming (53).
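The intermediate algebra between the Hessian and the variance formula may be filled in as follows. This is a sketch under the assumption that the elements of A are, in order, ⅛X^{-1}y*²k^{-3}{(2+q²)s - 2qc}, ¼X^{-1}y*k^{-2}{qc - s} (twice) and ½X^{-1}k^{-1}s, where the abbreviations s = e^q - e^{-q} and c = e^q + e^{-q} are introduced here for brevity.

```latex
\begin{align*}
\det A &= \frac{y^{*2}}{16X^{2}k^{4}}
          \Bigl[s\{(2+q^{2})s - 2qc\} - \{qc - s\}^{2}\Bigr]
        = \frac{y^{*2}}{16X^{2}k^{4}}\,(s^{2} - 4q^{2}),
          \qquad\text{since } c^{2} - s^{2} = 4,\\[4pt]
\operatorname{Var}\hat{k}
       &= \frac{\sigma_\epsilon^{2}\,A_{22}}{n\det A}
        = \frac{\sigma_\epsilon^{2}}{n}\cdot\frac{s}{2Xk}\cdot
          \frac{16X^{2}k^{4}}{y^{*2}(s^{2} - 4q^{2})}
        = \frac{8\,\sigma_\epsilon^{2}\,q^{3}\,s}{n\,X^{2}\,y^{*2}\,(s^{2} - 4q^{2})},\\[4pt]
s &= 2\sinh q,\quad s^{2} - 4q^{2} = 2(\cosh 2q - 2q^{2} - 1)
   \;\Longrightarrow\;
   \operatorname{Var}\hat{k}
        = \frac{8\,\sigma_\epsilon^{2}\,q^{3}\sinh q}{n\,X^{2}\,y^{*2}\,(\cosh 2q - 2q^{2} - 1)},\\[4pt]
q \to 0:&\quad \sinh q \approx q,\quad \cosh 2q - 2q^{2} - 1 \approx \tfrac{2}{3}q^{4}
   \;\Longrightarrow\;
   \operatorname{Var}\hat{k} \to \frac{12\,\sigma_\epsilon^{2}}{n\,X^{2}\,y^{*2}} .
\end{align*}
```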


(3·22) Internal least square results

We now turn to the internal least square estimator b and study its random sampling
distribution under the hypothesis that the observed values of y_i have expectations
y* exp(kx_i), from which they differ by independent random deviates ε_i having equal variances
σ_ε², viz.

    y(x_i) = y^*\exp(kx_i) + \epsilon_i.    (55)

We again use the approximation

    Xn^{-1}\sum f(x_i) \sim \int_{-\frac{1}{2}X}^{+\frac{1}{2}X} f(x)\,dx.

Using elementary and partial integration and putting kX = q, we reach the following
results (all integrals without stated limits being taken from -½X to +½X):

    Total of y = \int y(x)\,dx = \frac{y^*}{k}(e^{\frac{1}{2}q} - e^{-\frac{1}{2}q}) + \int\epsilon\,dx,
    \bar{y} = X^{-1}\,(total of y) = y^*q^{-1}(e^{\frac{1}{2}q} - e^{-\frac{1}{2}q}) + \bar{\epsilon};    (56)

    \eta = y - \bar{y} = y^*e^{kx} + k\gamma + \epsilon - \bar{\epsilon},
    where \gamma = -y^*(qk)^{-1}(e^{\frac{1}{2}q} - e^{-\frac{1}{2}q});    (57)

    Y(x) = \int_0^x y(x)\,dx = y^*k^{-1}e^{kx} - y^*k^{-1} + \int_0^x \epsilon\,dx;    (58)

    Y - \bar{Y} = y^*k^{-1}e^{kx} + \gamma + \int_0^x \epsilon\,dx - X^{-1}\int\!\!\int_0^x \epsilon\,d\xi\,dx;    (59)

    \int (Y - \bar{Y})^2\,dx = A + 2B\int\epsilon\,dx - 2\int\epsilon\,C(x)\,dx;    (60)

where

    A = \tfrac{1}{2}y^{*2}k^{-3}(e^q - e^{-q}) - \gamma^2X,
    B = \tfrac{1}{2}y^*k^{-2}(e^{\frac{1}{2}q} + e^{-\frac{1}{2}q}),    (61)
    C(x) = y^*k^{-2}e^{kx} + \gamma x,

and where the quadratic term in ε has been ignored. Similarly we obtain:

    \int \eta Y\,dx = k\Bigl\{A + B\int\epsilon\,dx + \int\epsilon\,(-C(x) + D(x))\,dx\Bigr\},    (62)

where

    D(x) = y^*k^{-2}e^{kx} + k^{-1}\gamma,

and where, again, the quadratic term in ε has been ignored. We now form our estimator b
in accordance with (40):

    -b = \int \eta Y\,dx \Big/ \int (Y - \bar{Y})^2\,dx,    (63)

and obtain to within linear terms in ε:

    -b = k\Bigl\{1 + A^{-1}\int\epsilon\,(-B + C(x) + D(x))\,dx\Bigr\}.    (64)

We first note that to within the approximation employed the expectation of -b is k, which
is the first term in the expansion of -b = 2 tanh ½k.†

Next we see that we have a representation of -b in the form constant + ∫εf(x)dx, i.e. a
weighted sum of residuals. Remembering now that ∫εf dx is being used as an approximation
to (X/n)Σε_i f(x_i) and using, therefore, the formula

    Variance\Bigl(\int\epsilon f(x)\,dx\Bigr) = \sigma_\epsilon^2\,Xn^{-1}\int f^2(x)\,dx,    (65)

we reach

    Variance b = \sigma_\epsilon^2\,Xn^{-1}k^2A^{-2}\int (-B + C(x) + D(x))^2\,dx,    (66)

which, after some lengthy algebra, emerges as

    Variance(b) = \sigma_\epsilon^2\,\frac{4q^3}{nX^2y^{*2}}\cdot
        \frac{-1 + (\frac{q}{12} + \frac{1}{q})\tanh\frac{1}{2}q + \frac{1}{4}q\coth\frac{1}{2}q}
             {\{1 - \frac{2}{q}\tanh\frac{1}{2}q\}^2\,(e^q - e^{-q})}.    (67)

It is easy to see that for q → 0, var b → 12σ_ε²/nX²y*², which is the same limit as (53) and
therefore agrees with both the maximum likelihood value as well as the classical variance
of the linear regression coefficient.

Fig. 3. Comparison of the variance§ of alternative estimators:
(a) Maximum likelihood (equation 52)
(b) Internal least square (equation 67)

§ The variances are standardized by putting σ_ε² = 1, X = 1, n = 1, y* = 1.

† For the order of discrepancy between -b and k see, for instance, examples 1 to 4 of Part 2.





(3·23) The efficiency of b

In Fig. 3 we have plotted both the maximum likelihood variance of \hat{k} as well as that of b.
For given values of σ_ε², n, X and y*, both variances are functions of q = kX. We have already
seen that for q → 0 they tend to the same limit, so that b is highly efficient for small q,† which
was to be expected from the discussion given in §(3·1). As q increases both variances decrease
rapidly, but the maximum likelihood variance more so and the efficiency of b drops until
for large q the efficiency tends to 600/q %. For such large values of q (say q ≥ 10), when the
fitted exponential is steep and sharply bent and when y_{-m} > 10y_m, it is doubtful whether the
basic assumption of uniform variance of the residuals ε_i over the whole range X is justified.

The objection that the variance of b depends on q, i.e. the parameter estimated by -bX,
can of course be overcome (at least approximately) by an appropriate variate transformation
of b. We shall not enter into this problem here.
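The two standardized variances of Fig. 3 are easily evaluated. The sketch below assumes σ_ε² = X = n = y* = 1 and takes the two formulae in the forms 8q³ sinh q/(cosh 2q - 2q² - 1) for the maximum likelihood estimator and 4q³{-1 + (q/12 + 1/q) tanh ½q + ¼q coth ½q}/[{1 - (2/q) tanh ½q}²(e^q - e^{-q})] for the internal least square estimator. The second form is a reading of a badly printed formula, chosen because it reproduces both stated limits, and should be treated as an assumption.

```python
import math

def var_ml(q):
    """Standardized variance of the maximum likelihood estimator of k."""
    return 8.0 * q**3 * math.sinh(q) / (math.cosh(2.0 * q) - 2.0 * q * q - 1.0)

def var_ils(q):
    """Standardized variance of the internal least square estimator b (assumed form)."""
    t = math.tanh(0.5 * q)
    numerator = -1.0 + (q / 12.0 + 1.0 / q) * t + 0.25 * q / t
    denominator = (1.0 - 2.0 * t / q) ** 2 * (math.exp(q) - math.exp(-q))
    return 4.0 * q**3 * numerator / denominator

def efficiency(q):
    """Ratio of the maximum likelihood variance to that of b."""
    return var_ml(q) / var_ils(q)
```

For small q both functions approach the common limit 12, and `q * efficiency(q)` approaches 6 as q grows, in line with the limiting efficiency of 600/q %. Values of q much below 0.1 should be avoided in this form, since both expressions then suffer from cancellation.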


REFERENCES 

Bartlett, M. S. (1946). J. Roy. Statist. Soc. Suppl. 8, 27.

Crowther, E. M. & Yates, F. (1941). Empire J. Exp. Agric. 9, 77.

Cunningham, L. B. C. & Hynd, W. R. B. (1946). J. Roy. Statist. Soc. Suppl. 8, 62.

Finney, D. J. (1947). J. Roy. Statist. Soc. Suppl. 9, 46.

Kendall, M. G. (1944). Biometrika, 33, 105.

Mann, H. B. & Wald, A. (1943). Econometrica, 11, 173.

Rhodes, E. C. (1940). J. Roy. Statist. Soc. 103, 362.

Sasuly, M. (1934). Trend Analysis of Statistics. Brookings Institution, Washington, D.C.

† Actually, on substituting in equations (52) and (67) it will be found that near q = 2 the maximum
likelihood variance is very slightly in excess of that of the internal least square solution, an anomaly
caused by the approximations on which (67) is based.





THE GEOMETRICAL METHOD IN THE
THEORY OF SAMPLING

By DAVID FOG, den kgl. Veterinaer- og Landbohojskole , Copenhagen 

For the determination of exact sampling distributions a number of different methods have 
been employed, of which two will be mentioned here: 

(1) The geometrical method, which has been applied with great success especially by 
R. A. Fisher (1915, 1925) and consists in the use of the terminology and results of multi- 
dimensional geometry; 

(2) the analytical method, the main line of which is a change of variables and calculation 
of the corresponding functional determinants. 

The first method appears elegant and short, whereas the second frequently leads to long 
calculations. On the other hand, the view is often expressed that the geometrical method 
arrives at its results too easily, at the cost of the accuracy usually required, and which is 
duly honoured by the analytical method. Probably this view has been affirmed by the 
circumstance that it has proved difficult to translate the geometrical methods and 
considerations so as to enter naturally and smoothly into the usual analytical treatment. 

With regard to the criticism that the geometrical method is less accurate, it should be 
observed that the justification depends entirely on the way in which the multidimensional 
geometry is built up. The construction of the multidimensional geometry may be worked out 
on a purely numerical basis, without introducing a single geometrical axiom, but in such a 
manner that the meaning of the geometrical terms as well as the contents of the geometrical 
formulae are of an entirely numerical nature. The system then rests on the same basis as 
the usual mathematical analysis and is therefore just as precise as the latter. 

This subject will not be discussed further here, but we shall show in the present paper 
how in a number of cases the geometrical method may, with complete preservation of its 
simplicity, be translated into analytical form, so that, without any knowledge of multi- 
dimensional geometry, advantage may be taken of the simple methods to which it has given 
rise. 


Before dealing with the examples we shall recall the concepts of volume and surface- 
content of an n-dimensional sphere. Such a sphere lies in an n-dimensional space and may 
be represented by the equation 



$$x_1^2 + x_2^2 + \dots + x_n^2 = a^2; \tag{1}$$

its interior is the domain

$$x_1^2 + x_2^2 + \dots + x_n^2 < a^2. \tag{2}$$

The volume of this domain is determined as the multiple integral

$$V_n(a) = \int\!\cdots\!\int_{\sum_1^n x_i^2 < a^2} dx_1\,dx_2 \dots dx_n, \tag{3}$$

and it is easily shown that

$$V_n(a) = \frac{\pi^{\frac12 n}\,a^n}{\Gamma(\tfrac12 n + 1)}. \tag{4}$$



In order to find the corresponding surface-content $S_n(a)$ we consider $dV_n(a)$. This denotes
the volume between two concentric spheres of radii $a$ and $a + da$; thus $dV_n(a) = S_n(a)\,da$.
We therefore obtain

$$S_n(a) = \frac{dV_n}{da} = \frac{2\pi^{\frac12 n}\,a^{n-1}}{\Gamma(\tfrac12 n)}. \tag{5}$$

It should be noted that the result (4), even with exclusion of all geometry, has a definite 
meaning as the value of the multiple integral (3). As for (5), from a purely analytical 
point of view, we need only regard it in the following as a definition of $S_n(a)$.
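As a purely numerical illustration of (4) and (5), the following short Python sketch evaluates both formulae with the standard-library gamma function. The function names `sphere_volume` and `sphere_surface` are chosen here for illustration only and do not come from the paper.

```python
import math


def sphere_volume(n: int, a: float) -> float:
    """V_n(a) from formula (4): pi^(n/2) * a^n / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) * a ** n / math.gamma(n / 2 + 1)


def sphere_surface(n: int, a: float) -> float:
    """S_n(a) from formula (5): 2 * pi^(n/2) * a^(n-1) / Gamma(n/2)."""
    return 2 * math.pi ** (n / 2) * a ** (n - 1) / math.gamma(n / 2)


if __name__ == "__main__":
    # For n = 2 and n = 3 the familiar circle and sphere formulae reappear.
    for n in (1, 2, 3, 4, 5):
        print(n, sphere_volume(n, 1.0), sphere_surface(n, 1.0))
```

For n = 2 and n = 3 the values agree with the elementary results for the circle and the ordinary sphere, and a finite-difference derivative of `sphere_volume` with respect to the radius reproduces `sphere_surface`, which is all that the definition (5) asserts.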

We shall now consider three examples. The first two are quite elementary, and the results
of all three are well known. Each example will be treated in two ways, in a geometrical and 
in an analytical manner, the latter being the translation of the former into analytical 
language; the essential points are the possibility of this translation and its method of 
working out. 

Example 1. Let $x_1, x_2, \dots, x_n$ be $n$ mutually independent variables, all normal $(0, \sigma)$,
i.e. having the frequency function

$$p\{x\} = \frac{1}{\sqrt{(2\pi)}\,\sigma} \exp\left[-\frac{x^2}{2\sigma^2}\right]. \tag{6}$$

The frequency function of the set $X = (x_1, x_2, \dots, x_n)$ is then

$$\kappa = p\{X\} = \frac{1}{(\sqrt{(2\pi)}\,\sigma)^n} \exp\left[-\frac{1}{2\sigma^2}\sum_1^n x_i^2\right]. \tag{7}$$

Putting

$$q = \sqrt{\Bigl(\sum_1^n x_i^2\Bigr)}, \tag{8}$$

the formula (7) may also be written

$$\kappa = \frac{1}{(\sqrt{(2\pi)}\,\sigma)^n} \exp\left[-\frac{q^2}{2\sigma^2}\right]. \tag{9}$$

Assuming $X$ to be a point in an $n$-dimensional space and $\kappa = p\{X\}$ to be the density at
this point, the probability of $X$ being in a domain $\omega$ is equal to the mass of $\omega$ and is given
by the multiple integral

$$\int\!\cdots\!\int \kappa\,dX, \tag{10}$$

taken throughout $\omega$, where $dX = dx_1\,dx_2 \dots dx_n$ is an element of volume.

This in particular comprises the determination of the distribution of an arbitrary statistic

$$y = \phi(x_1, x_2, \dots, x_n).$$

Designating its distribution function as $P\{y\}$, (10) gives $dP\{y\}$ when for $\omega$ we put the domain

$$y < \phi(x_1, x_2, \dots, x_n) < y + dy.$$

We shall now find the distribution of the statistic $q$, introduced in (8). This variable may
be interpreted as the distance from the origin $O$ to the point $X$. Hence we use as element of
volume the region between two $n$-dimensional spheres of common centre $O$ and radii $q$
and $q + dq$ and obtain

$$dP\{q\} = \kappa\,dV_n(q) = \kappa\,S_n(q)\,dq.$$

Inserting the values of $\kappa$ and $S_n(q)$ from (9) and (5) we get

$$dP\{q\} = \frac{1}{2^{\frac12 n - 1}\,\Gamma(\tfrac12 n)} \exp\left[-\frac{q^2}{2\sigma^2}\right] \left(\frac{q}{\sigma}\right)^{n-1} \frac{dq}{\sigma}, \tag{11}$$

and we have obtained the distribution required. For $\sigma = 1$, (11) in particular shows the
distribution of $\chi$.
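The result (11) can be checked numerically. The sketch below, which is an illustration added here and not part of the original argument, codes the density (11), integrates it by a plain midpoint rule, and compares its mean with a small simulation of $q = \sqrt{(\sum x_i^2)}$. All function names, the choice n = 5, σ = 2 and the random seed are assumptions made for the example.

```python
import math
import random


def chi_density(q: float, n: int, sigma: float = 1.0) -> float:
    """Density of q = sqrt(sum of x_i^2) as given by formula (11)."""
    const = 1.0 / (2 ** (n / 2 - 1) * math.gamma(n / 2))
    return const * math.exp(-q * q / (2 * sigma ** 2)) * (q / sigma) ** (n - 1) / sigma


def integrate(f, lo: float, hi: float, steps: int = 20000) -> float:
    """Plain midpoint rule; adequate for the smooth integrands used here."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def simulated_mean_q(n: int, sigma: float, trials: int, seed: int = 0) -> float:
    """Average of q over repeated samples of n independent normal (0, sigma) values."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += math.sqrt(sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n)))
    return total / trials


if __name__ == "__main__":
    n, sigma = 5, 2.0
    print("total mass  :", integrate(lambda q: chi_density(q, n, sigma), 0.0, 40.0))
    print("mean by (11):", integrate(lambda q: q * chi_density(q, n, sigma), 0.0, 40.0))
    print("mean by simulation:", simulated_mean_q(n, sigma, 20000))
```

The total mass should come out as 1 without any separate normalization, in line with the remark at the end of the paper that the constants enter automatically through formula (4).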




Next we will prove (11) without any use of geometry, but following the exposition above
as closely as possible. We then start with (9), where $q$ is determined by (8), and

$$dP\{X\} = \kappa\,dX. \tag{12}$$

By an almost* one-to-one correspondence we pass from the variables $X = (x_1, x_2, \dots, x_n)$ to

$$q, \qquad U = (u_1, u_2, \dots, u_{n-1}),$$

the $u$'s being co-ordinates on the unit-sphere, the specification of which is of no importance
for the proof.† Designating by $\Delta$ the jacobian $\partial(X)/\partial(q, U)$, we get

$$dP\{q, U\} = \kappa\,|\Delta|\,dq\,dU,$$

from which we find

$$dP\{q\} = \kappa\,dq \int\!\cdots\!\int_{\omega} |\Delta|\,dU, \tag{13}$$

$\omega$ being the domain of the $u$'s.

Hence it remains only to determine the multiple integral in (13), which is done as follows.
From

$$V_n(a) = \int\!\cdots\!\int_{\sum_1^n x_i^2 < a^2} dX$$

we obtain by means of the substitution used above

$$V_n(a) = \int\!\cdots\!\int |\Delta|\,dq\,dU = \int_0^a dq \int\!\cdots\!\int_{\omega} |\Delta|\,dU,$$

thus by differentiation

$$\frac{dV_n(q)}{dq} = \int\!\cdots\!\int_{\omega} |\Delta|\,dU.$$

Hence the integral desired is equal to $S_n(q)$, and (13) becomes

$$dP\{q\} = \kappa\,S_n(q)\,dq,$$

which, just as before, leads to (11).

Note. The assumption that the true mean of $x$ is 0 is of little importance. For if this mean
is equal to $\mu \neq 0$, we have only to consider, instead of $x$, the deviation $x - \mu$ from $\mu$. We then
find that

$$\sqrt{\Bigl(\sum_1^n (x_i - \mu)^2\Bigr)}$$

has the distribution (11). A similar modification will apply to the following examples.

Example 2. As before, let $X = (x_1, x_2, \dots, x_n)$ be distributed so that (6) and (7) are valid.
From the $x$'s we form the linear combination

$$z = \sum_1^n a_i x_i, \tag{14}$$

where for the sake of convenience we may assume that

$$\sum_1^n a_i^2 = 1. \tag{15}$$


* The exceptional points forming a domain of volume 0.
† A suitable transformation is for instance

$$\begin{aligned}
x_1 &= q \cos u_1, \\
x_2 &= q \sin u_1 \cos u_2, \\
&\;\;\vdots \\
x_{n-1} &= q \sin u_1 \sin u_2 \dots \sin u_{n-2} \cos u_{n-1}, \\
x_n &= q \sin u_1 \sin u_2 \dots \sin u_{n-2} \sin u_{n-1},
\end{aligned}$$

with $q > 0$, $0 < u_i < \pi$ $(i = 1, 2, \dots, n-2)$ and $0 < u_{n-1} < 2\pi$.



In Fig. 1, $\alpha$ represents the hyperplane $\sum_1^n a_i x_i = 0$, passing through $O$; the line $l$ is
perpendicular to $\alpha$ at $O$. The point $X$ is projected on $l$ and $\alpha$ as $N$ and $Q$; the quadrilateral
$ONXQ$ is a rectangle. The co-ordinates of $N$ and $Q$ are easily found to be

$$N = (a_1 z, a_2 z, \dots, a_n z), \qquad Q = (x_1 - a_1 z, \dots, x_n - a_n z).$$

Further $ON = z$, and for $OQ$, denoted by $q_1$, we find

$$q_1^2 = \sum_1^n (x_i - a_i z)^2. \tag{16}$$

From the right-angled triangle $OQX$, where $OX^2 = \sum_1^n x_i^2$, we have

$$\sum_1^n x_i^2 = z^2 + q_1^2, \tag{17}$$

which may also be easily verified directly by calculation.

Fig. 1.

By means of (17), we may write (7) as

$$\kappa = \frac{1}{(\sqrt{(2\pi)}\,\sigma)^n} \exp\left[-\frac{z^2 + q_1^2}{2\sigma^2}\right]. \tag{18}$$

We will now find the joint distribution of $z$ and $q_1$. As element of volume for $X$ we use
a domain, the projection of which on $\alpha$ is the region between two $(n-1)$-dimensional spheres
in $\alpha$, having the common centre $O$ and the radii $q_1$ and $q_1 + dq_1$, and the projection of which
on $l$ is the line-element $dz$. We then get

$$dP\{z, q_1\} = \kappa\,dV_{n-1}(q_1)\,dz = \kappa\,S_{n-1}(q_1)\,dq_1\,dz. \tag{19}$$

Hence by (18) and (5)

$$dP\{z, q_1\} = \frac{1}{\sqrt{(2\pi)}} \exp\left[-\frac{z^2}{2\sigma^2}\right] \frac{dz}{\sigma} \cdot \frac{1}{2^{\frac12(n-3)}\,\Gamma\tfrac12(n-1)} \exp\left[-\frac{q_1^2}{2\sigma^2}\right] \left(\frac{q_1}{\sigma}\right)^{n-2} \frac{dq_1}{\sigma}. \tag{20}$$

Thus $z$ and $q_1$ are independent; $z$ is normal $(0, \sigma)$, and $q_1$ is distributed as $q$ in (11), but with
$n - 1$ instead of $n$.

If in particular $a_1 = a_2 = \dots = a_n = 1/\sqrt{n}$, we get, $\bar{x}$ being the mean,

$$z = \bar{x}\sqrt{n}, \qquad q_1^2 = \sum_1^n (x_i - \bar{x})^2.$$
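The decomposition (16) and (17) is easy to verify on numbers. The following sketch, added here as an illustration, computes $z$ and $q_1$ for a given coefficient vector and checks the special case $a_i = 1/\sqrt{n}$. The helper names `normalize` and `decompose` and the sample values are assumptions of this example.

```python
import math


def normalize(a):
    """Scale the coefficients so that the sum of a_i^2 is 1, as assumed in (15)."""
    norm = math.sqrt(sum(ai * ai for ai in a))
    return [ai / norm for ai in a]


def decompose(x, a):
    """Return (z, q1) with z from (14) and q1 from (16); a must satisfy (15)."""
    z = sum(ai * xi for ai, xi in zip(a, x))
    q1 = math.sqrt(sum((xi - ai * z) ** 2 for ai, xi in zip(a, x)))
    return z, q1


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 8.0]
    n = len(x)
    a = [1.0 / math.sqrt(n)] * n          # the special case a_i = 1/sqrt(n)
    z, q1 = decompose(x, a)
    mean = sum(x) / n
    print("z          :", z, " mean*sqrt(n):", mean * math.sqrt(n))
    print("q1^2       :", q1 ** 2, " sum (x - mean)^2:", sum((xi - mean) ** 2 for xi in x))
    print("z^2 + q1^2 :", z * z + q1 * q1, " sum x^2:", sum(xi * xi for xi in x))
```

The same check goes through for any other coefficient vector once it has been passed through `normalize`, which is relation (17) in numerical form.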




Let us now, without the aid of geometry, once more prove (20), starting from (6), (7),
(14), (15) and

$$dP\{X\} = \kappa\,dX. \tag{21}$$

We introduce new variables $z$, $Y = (y_1, y_2, \dots, y_{n-1})$ by means of an orthogonal substitution

$$z = a_1 x_1 + a_2 x_2 + \dots + a_n x_n, \qquad y_i = a_{i1} x_1 + a_{i2} x_2 + \dots + a_{in} x_n \quad (i = 1, 2, \dots, n-1), \tag{22}$$

and put

$$q_1 = \sqrt{\Bigl(\sum_1^{n-1} y_i^2\Bigr)}. \tag{23}$$

As (22) leaves sums of squares invariant we get

$$\sum_1^n x_i^2 = z^2 + q_1^2. \tag{24}$$

A comparison with (17) shows that the $q_1$'s introduced by (23) and (16) are identical.

The jacobian corresponding to (22) being $\pm 1$ we get from (21)

$$dP\{z, Y\} = \kappa\,dz\,dY, \tag{25}$$

where as before $\kappa$ is determined by (18).

By a transformation analogous to the one on p. 48, but with $n - 1$ instead of $n$, we may
instead of $Y$ introduce new variables $(q_1, u_1, u_2, \dots, u_{n-2})$. After integration relative to the
$u$'s we have

$$dP\{z, q_1\} = \kappa\,dz\,S_{n-1}(q_1)\,dq_1,$$

which is equivalent to (19) and immediately leads to (20).

Example 3. As a final example we consider the normal correlation between $k$ variables.
In the geometrical treatment we use a method employed by Wishart (1928) for the derivation
of the distribution named after him, with certain simplifications, however, which will
facilitate the transition to the analytical exposition. Furthermore, for convenience, we
assume in the following $k = 3$; the generalization from $k = 3$ to an arbitrary value should cause
no difficulties, the addition of one more variable merely introducing another step of reduction.

Let $x$, $y$, $z$ denote three normally correlated variables. Assuming that the three true means
are equal to 0, the frequency function may be written

$$p\{x, y, z\} = \frac{\sqrt{A}}{(2\pi)^{\frac32}} \exp\left[-\tfrac12 (a_{11} x^2 + 2a_{12} xy + \dots + a_{33} z^2)\right], \tag{26}$$

where $A$ is the determinant $|a_{\mu\nu}|$. We consider $n$ triples $(x_i, y_i, z_i)$, mutually independent
and each satisfying (26). We introduce the designations

$$l_{11} = \sum_1^n x_i^2, \qquad l_{12} = \sum_1^n x_i y_i, \qquad \dots, \qquad l_{33} = \sum_1^n z_i^2. \tag{27}$$

Putting furthermore in the same way as before

$$X = (x_1, x_2, \dots, x_n), \qquad Y = (y_1, y_2, \dots, y_n), \qquad Z = (z_1, z_2, \dots, z_n), \tag{28}$$

we get

$$\kappa = p\{X, Y, Z\} = \frac{A^{\frac12 n}}{(2\pi)^{\frac32 n}} \exp\left[-\tfrac12 (a_{11} l_{11} + 2a_{12} l_{12} + \dots + a_{33} l_{33})\right]. \tag{29}$$

In the following our aim is to find the distribution of the quantities $l_{11}, l_{12}, \dots, l_{33}$.
$\kappa$ in (29) denotes the density at the point

$$M = (x_1, \dots, x_n, y_1, y_2, \dots, y_n, z_1, z_2, \dots, z_n) \tag{30}$$

in a $3n$-dimensional space; the origin is denoted $O$. Within this space we consider three
$n$-dimensional subspaces, corresponding to the first $n$, the intermediate $n$, and the last $n$
co-ordinates. The projections of $M$ on these three subspaces are the points $X$, $Y$, and $Z$ in (28)
in the sense that for each point only the $n$ co-ordinates belonging to the corresponding
subspace are indicated (the other $2n$ being zero).

For the time being it will be convenient to imagine that $Y$ and $Z$, with the co-ordinates
shown in (28), are placed in the same $n$-dimensional subspace as $X$. Thereby we are, inter alia,
enabled to give a simple geometrical interpretation of all the $l$'s. Thus $l_{11}$ means the square
of the distance $OX$, while $l_{12}$ means the product $OX \cdot OY \cdot \cos\phi_{12}$, where $\phi_{12}$ denotes the
angle $XOY$, etc.

In the $n$-dimensional space just mentioned we introduce a new co-ordinate system with
the same origin $O$ and in close relation to the points $X$, $Y$, and $Z$ (Fig. 2): the $x_1$-axis is
chosen along $OX$, the $x_2$-axis perpendicular to $OX$ in the plane $OXY$, and the $x_3$-axis
perpendicular to the plane $OXY$ in the three-dimensional space $OXYZ$. The remaining
$n - 3$ axes may be chosen arbitrarily (subject only to the conditions of orthogonality).

The new co-ordinates of the points $X$, $Y$ and $Z$ are denoted as in Fig. 2 (where only the
first three are shown, the remaining being zero), that is, $X = (\xi_1, 0, 0)$, $Y = (\eta_1, \eta_2, 0)$ and
$Z = (\zeta_1, \zeta_2, \zeta_3)$. Further we may assume that the change of
co-ordinates is carried out in such a manner that $\xi_1 > 0$, $\eta_2 > 0$ and $\zeta_3 > 0$.

The $l$'s being invariant by change of co-ordinates and hence expressed in the same manner
by the new and the old co-ordinates, we have

$$\left.\begin{aligned}
l_{11} &= \xi_1^2, & l_{12} &= \xi_1 \eta_1, & l_{13} &= \xi_1 \zeta_1, \\
l_{22} &= \eta_1^2 + \eta_2^2, & l_{23} &= \eta_1 \zeta_1 + \eta_2 \zeta_2, \\
l_{33} &= \zeta_1^2 + \zeta_2^2 + \zeta_3^2.
\end{aligned}\right\} \tag{31}$$

In relation to any set of points $X$, $Y$, $Z$ in the $n$-dimensional space we may introduce a
new co-ordinate system in the above manner and determine the six characteristic quantities
$\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3$. To begin with the distribution of these six quantities is sought.
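The six characteristic quantities can be computed for any three numerical vectors by carrying out the construction of the new axes step by step, which amounts to a Gram-Schmidt orthogonalization. The sketch below is an illustration added here under that reading; the function names `reduce_triple` and `moments_from_reduced` and the sample vectors are assumptions. It checks that the relations (31) reproduce the sums of squares and products (27).

```python
import math


def dot(u, v):
    """Sum of products of corresponding co-ordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))


def reduce_triple(X, Y, Z):
    """Return (xi1, eta1, eta2, zeta1, zeta2, zeta3) for linearly independent X, Y, Z."""
    xi1 = math.sqrt(dot(X, X))
    e1 = [x / xi1 for x in X]                    # unit vector along OX
    eta1 = dot(Y, e1)
    r = [y - eta1 * e for y, e in zip(Y, e1)]    # part of Y perpendicular to OX
    eta2 = math.sqrt(dot(r, r))
    e2 = [ri / eta2 for ri in r]                 # second axis, in the plane OXY
    zeta1 = dot(Z, e1)
    zeta2 = dot(Z, e2)
    s = [z - zeta1 * p - zeta2 * q for z, p, q in zip(Z, e1, e2)]
    zeta3 = math.sqrt(dot(s, s))                 # distance of Z from the plane OXY
    return xi1, eta1, eta2, zeta1, zeta2, zeta3


def moments_from_reduced(xi1, eta1, eta2, zeta1, zeta2, zeta3):
    """The l's expressed by the six quantities, following (31)."""
    return {
        "l11": xi1 ** 2,
        "l12": xi1 * eta1,
        "l13": xi1 * zeta1,
        "l22": eta1 ** 2 + eta2 ** 2,
        "l23": eta1 * zeta1 + eta2 * zeta2,
        "l33": zeta1 ** 2 + zeta2 ** 2 + zeta3 ** 2,
    }


if __name__ == "__main__":
    X = [3.0, 0.0, 4.0, 0.0, 0.0]
    Y = [1.0, 2.0, 0.0, 1.0, 0.0]
    Z = [0.0, 1.0, 1.0, 1.0, 2.0]
    print(moments_from_reduced(*reduce_triple(X, Y, Z)))
    print(dot(X, X), dot(X, Y), dot(X, Z), dot(Y, Y), dot(Y, Z), dot(Z, Z))
```

Taking the positive square roots for `eta2` and `zeta3` corresponds to the convention $\eta_2 > 0$, $\zeta_3 > 0$ adopted in the text.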

With that in view we consider such sets of points $X$, $Y$, $Z$, for which the six quantities
have fixed values and where, moreover, the points $X$ and $Y$ are fixed. The projection
$Z_0 = (\zeta_1, \zeta_2, 0)$ of $Z$ on the plane $OXY$ will also be fixed, and the distance of $Z$ from the
plane has the constant value $\zeta_3$. The point $Z$, therefore, will be able to move on the surface
of an $(n-2)$-dimensional sphere of centre $Z_0$ and radius $\zeta_3$, situated in an $(n-2)$-dimensional
space perpendicular to the plane $OXY$ in $Z_0$. To $Z$ we attach (see Fig. 2) a small element of
volume $d\zeta_1\,d\zeta_2\,d\zeta_3$. When $Z$ moves on its spherical surface this element of volume will describe
an $n$-dimensional volume

$$S_{n-2}(\zeta_3)\,d\zeta_1\,d\zeta_2\,d\zeta_3. \tag{32}$$

Next we consider such sets of points $X$, $Y$, where $\xi_1$, $\eta_1$ and $\eta_2$ have fixed values, and where
moreover $X$ is fixed. The point $Y$ will then be able to move on an $(n-1)$-dimensional sphere
of radius $\eta_2$ and centre in the projection of $Y$ on $OX$; a small element of area $d\eta_1\,d\eta_2$ attached
to $Y$ will thereby describe a volume

$$S_{n-1}(\eta_2)\,d\eta_1\,d\eta_2. \tag{33}$$

Finally, when $X$ moves in such a manner that the value of $\xi_1$ is fixed, then a line-element
$d\xi_1$ attached to $X$ will describe an element of volume

$$S_n(\xi_1)\,d\xi_1. \tag{34}$$

We now return to the point $M$ in (30), the projections of which on three mutually orthogonal,
$n$-dimensional subspaces are $X$, $Y$ and $Z$ in their original positions. To a given set of
values $\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3$ with corresponding differentials $d\xi_1, d\eta_1, d\eta_2, d\zeta_1, d\zeta_2, d\zeta_3$ we obtain
in the $3n$-dimensional space an element of volume equal to the product of the three elements
(32), (33) and (34), i.e.

$$S_n(\xi_1)\,S_{n-1}(\eta_2)\,S_{n-2}(\zeta_3)\,d\xi_1\,d\eta_1\,d\eta_2\,d\zeta_1\,d\zeta_2\,d\zeta_3. \tag{35}$$

Hence we have

$$dP\{\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3\} = \kappa\,S_n(\xi_1)\,S_{n-1}(\eta_2)\,S_{n-2}(\zeta_3)\,d\xi_1\,d\eta_1\,d\eta_2\,d\zeta_1\,d\zeta_2\,d\zeta_3, \tag{36}$$

where $\kappa$ is given by (29). Thus we have found the required distribution of $\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3$,
as by means of the formulae (31) these six quantities may be inserted in $\kappa$.

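Substituting the values of the spherical surfaces in (36) we get

$$\begin{aligned}
dP\{\xi_1, \eta_1, \dots, \zeta_3\} &= \kappa \cdot \frac{2\pi^{\frac12 n}\,\xi_1^{n-1}}{\Gamma(\tfrac12 n)} \cdot \frac{2\pi^{\frac12(n-1)}\,\eta_2^{n-2}}{\Gamma\tfrac12(n-1)} \cdot \frac{2\pi^{\frac12(n-2)}\,\zeta_3^{n-3}}{\Gamma\tfrac12(n-2)}\,d\xi_1\,d\eta_1 \dots d\zeta_3 \\
&= \kappa\,\frac{8\pi^{\frac12(3n-3)}}{\Gamma(\tfrac12 n)\,\Gamma\tfrac12(n-1)\,\Gamma\tfrac12(n-2)}\,\xi_1^{n-1}\,\eta_2^{n-2}\,\zeta_3^{n-3}\,d\xi_1\,d\eta_1 \dots d\zeta_3.
\end{aligned} \tag{37}$$

We shall now find the distribution of the $l$'s introduced in (27). The jacobian of the $l$'s in
relation to $\xi_1, \eta_1, \dots, \zeta_3$ may be split up into a product of three and becomes

$$\frac{\partial(l_{11})}{\partial(\xi_1)} \cdot \frac{\partial(l_{12}, l_{22})}{\partial(\eta_1, \eta_2)} \cdot \frac{\partial(l_{13}, l_{23}, l_{33})}{\partial(\zeta_1, \zeta_2, \zeta_3)} = 2\xi_1 \begin{vmatrix} \xi_1 & 0 \\ 2\eta_1 & 2\eta_2 \end{vmatrix} \begin{vmatrix} \xi_1 & 0 & 0 \\ \eta_1 & \eta_2 & 0 \\ 2\zeta_1 & 2\zeta_2 & 2\zeta_3 \end{vmatrix} = 8\,\xi_1^3\,\eta_2^2\,\zeta_3.$$

Hence we get

$$dP\{l_{11}, l_{12}, \dots, l_{33}\} = \kappa\,\frac{\pi^{\frac12(3n-3)}}{\Gamma(\tfrac12 n)\,\Gamma\tfrac12(n-1)\,\Gamma\tfrac12(n-2)}\,(\xi_1 \eta_2 \zeta_3)^{n-4}\,dl_{11}\,dl_{12} \dots dl_{33}. \tag{38}$$

Moreover, by means of the rule of multiplying determinants we have

$$\begin{vmatrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \\ l_{31} & l_{32} & l_{33} \end{vmatrix} = \begin{vmatrix} \xi_1 & 0 & 0 \\ \eta_1 & \eta_2 & 0 \\ \zeta_1 & \zeta_2 & \zeta_3 \end{vmatrix}^2 = (\xi_1 \eta_2 \zeta_3)^2,$$

whence (38) becomes

$$dP\{l_{11}, l_{12}, \dots, l_{33}\} = \kappa\,\frac{\pi^{\frac12(3n-3)}}{\Gamma(\tfrac12 n)\,\Gamma\tfrac12(n-1)\,\Gamma\tfrac12(n-2)} \begin{vmatrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \\ l_{31} & l_{32} & l_{33} \end{vmatrix}^{\frac12(n-4)} dl_{11}\,dl_{12} \dots dl_{33}. \tag{39}$$

This is the required distribution of the $l$'s, as $\kappa$ is expressed by the $l$'s in (29).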
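The two determinant facts used in passing from (37) to (39), namely that the jacobian of the $l$'s equals $8\xi_1^3\eta_2^2\zeta_3$ and that the determinant of the $l$'s equals $(\xi_1\eta_2\zeta_3)^2$, can be confirmed numerically. The sketch below is an added illustration: it writes out the partial derivatives of the relations (31) and evaluates both determinants by ordinary elimination. The function names and the sample values are assumptions of this example.

```python
def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n = len(m)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def moment_matrix(xi1, eta1, eta2, zeta1, zeta2, zeta3):
    """Symmetric matrix of the l's built from the relations (31)."""
    l11, l12, l13 = xi1 ** 2, xi1 * eta1, xi1 * zeta1
    l22, l23 = eta1 ** 2 + eta2 ** 2, eta1 * zeta1 + eta2 * zeta2
    l33 = zeta1 ** 2 + zeta2 ** 2 + zeta3 ** 2
    return [[l11, l12, l13], [l12, l22, l23], [l13, l23, l33]]


def jacobian(xi1, eta1, eta2, zeta1, zeta2, zeta3):
    """Partial derivatives of (l11, l12, l22, l13, l23, l33) with respect to
    (xi1, eta1, eta2, zeta1, zeta2, zeta3), one row per l."""
    return [
        [2 * xi1, 0.0, 0.0, 0.0, 0.0, 0.0],
        [eta1, xi1, 0.0, 0.0, 0.0, 0.0],
        [0.0, 2 * eta1, 2 * eta2, 0.0, 0.0, 0.0],
        [zeta1, 0.0, 0.0, xi1, 0.0, 0.0],
        [0.0, zeta1, zeta2, eta1, eta2, 0.0],
        [0.0, 0.0, 0.0, 2 * zeta1, 2 * zeta2, 2 * zeta3],
    ]


if __name__ == "__main__":
    p = (1.5, -0.4, 2.0, 0.7, -1.1, 0.9)
    xi1, _, eta2, _, _, zeta3 = p
    print(det(jacobian(*p)), 8 * xi1 ** 3 * eta2 ** 2 * zeta3)
    print(det(moment_matrix(*p)), (xi1 * eta2 * zeta3) ** 2)
```

With the rows ordered as above the 6 by 6 jacobian is block-triangular, which is why it splits into the product of three smaller determinants as stated in the text.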




We shall now repeat the proof in a manner which will not involve any geometrical aids.
To do this, we proceed once more from (26) and introduce the $l$'s by (27). Furthermore, we
introduce $X$, $Y$ and $Z$ by (28); then

$$dP\{X, Y, Z\} = \kappa\,dX\,dY\,dZ, \tag{40}$$

where $\kappa$ is determined by (29). We may for the sake of convenience continue to use the words
'point' and 'co-ordinate'.

We consider an orthogonal transformation

$$\left.\begin{aligned}
t'_1 &= \frac{1}{\xi_1}(x_1 t_1 + \dots + x_n t_n), \qquad \xi_1 = \sqrt{\Bigl(\sum_1^n x_i^2\Bigr)}, \\
t'_i &= \alpha_{i1} t_1 + \dots + \alpha_{in} t_n, \qquad i = 2, 3, \dots, n,
\end{aligned}\right\} \tag{I}$$

carrying $(t_1, t_2, \dots, t_n)$ into $(t'_1, t'_2, \dots, t'_n)$. Applied to the point $X$ it gives

$$X' = (\xi_1, 0, 0, \dots, 0),$$

all the co-ordinates of $X'$ except the first being equal to zero owing to the conditions of
orthogonality

$$\alpha_{i1} x_1 + \dots + \alpha_{in} x_n = 0 \qquad (i = 2, 3, \dots, n).$$

By (I) the points $Y$ and $Z$ are transformed into

$$Y' = (\eta_1, y'_2, \dots, y'_n), \qquad Z' = (\zeta_1, z'_2, \dots, z'_n),$$

where the co-ordinates need not be calculated. The special symbols $\eta_1$ and $\zeta_1$ are due to the
fact that these co-ordinates will not change by the two following transformations.

Next we use a new orthogonal transformation

$$\left.\begin{aligned}
t''_1 &= t'_1, \\
t''_2 &= \frac{1}{\eta_2}(y'_2 t'_2 + \dots + y'_n t'_n), \qquad \eta_2 = \sqrt{\Bigl(\sum_2^n y_i'^2\Bigr)}, \\
t''_i &= \beta_{i2} t'_2 + \dots + \beta_{in} t'_n, \qquad i = 3, \dots, n.
\end{aligned}\right\} \tag{II}$$

This transformation does not change $X'$, but carries $Y'$ and $Z'$ into

$$Y'' = (\eta_1, \eta_2, 0, \dots, 0), \qquad Z'' = (\zeta_1, \zeta_2, z''_3, \dots, z''_n).$$

$\eta_1$ and $\zeta_1$ remain unchanged as noted above. The last $n - 2$ co-ordinates of $Z''$ need not be
calculated. The special symbol $\zeta_2$ is due to the fact that this quantity is unchanged by the
following transformation.

Finally, we use a third orthogonal transformation

$$\left.\begin{aligned}
t'''_1 &= t''_1, \\
t'''_2 &= t''_2, \\
t'''_3 &= \frac{1}{\zeta_3}(z''_3 t''_3 + \dots + z''_n t''_n), \qquad \zeta_3 = \sqrt{\Bigl(\sum_3^n z_i''^2\Bigr)}, \\
t'''_i &= \gamma_{i3} t''_3 + \dots + \gamma_{in} t''_n, \qquad i = 4, 5, \dots, n.
\end{aligned}\right\} \tag{III}$$

This transformation changes neither $X'$ nor $Y''$, but carries $Z''$ into

$$Z''' = (\zeta_1, \zeta_2, \zeta_3, 0, \dots, 0).$$

Thus we have once more introduced the quantities $\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3$, known from the
geometrical treatment, and as the $l$'s are invariant by orthogonal transformations we again
obtain (31).


We now return to (40) and apply some transformations of variables:

(1) For $Y$ and $Z$ we introduce $Y'$ and $Z'$, respectively, whereas $X$ is left unchanged. As
the coefficients of (I) depend only on $X$, the jacobians are equal to $\pm 1$, and (40) is transformed into

$$dP\{X, \eta_1, Y'_1, \zeta_1, Z'_1\} = \kappa\,dX\,d\eta_1\,dY'_1\,d\zeta_1\,dZ'_1, \tag{41}$$

where $Y'_1 = (y'_2, \dots, y'_n)$ and $Z'_1 = (z'_2, \dots, z'_n)$.

(2) For $Z'_1$ we introduce the last $n - 1$ co-ordinates from $Z''$, while $X$, $\eta_1$ and $Y'_1$ as well
as $\zeta_1$ are unchanged. As the coefficients of (II) depend only on $Y'_1$ the jacobian as above is
equal to $\pm 1$, and (41) becomes

$$dP\{X, \eta_1, Y'_1, \zeta_1, \zeta_2, Z''_2\} = \kappa\,dX\,d\eta_1\,dY'_1\,d\zeta_1\,d\zeta_2\,dZ''_2, \tag{42}$$

where $Z''_2 = (z''_3, \dots, z''_n)$.

According to (31) $\kappa$ may be expressed by the variables $\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3$ only. Hence, as in
example 1, instead of $X$ we may introduce $\xi_1 = \sqrt{(\sum_1^n x_i^2)}$ together with $U = (u_1, u_2, \dots, u_{n-1})$,
for $Y'_1$ similarly $\eta_2 = \sqrt{(\sum_2^n y_i'^2)}$ together with $V = (v_1, v_2, \dots, v_{n-2})$ and finally for $Z''_2$,
$\zeta_3 = \sqrt{(\sum_3^n z_i''^2)}$ and $W = (w_1, w_2, \dots, w_{n-3})$ as new variables. Eliminating $U$, $V$ and $W$ by
integration we get from (42)

$$dP\{\xi_1, \eta_1, \eta_2, \zeta_1, \zeta_2, \zeta_3\} = \kappa\,S_n(\xi_1)\,S_{n-1}(\eta_2)\,S_{n-2}(\zeta_3)\,d\xi_1\,d\eta_1\,d\eta_2\,d\zeta_1\,d\zeta_2\,d\zeta_3.$$

Thus we have once more proved (36) and may as before obtain (37) as well as (39).

Just as we have seen in example 2 that $\sum_1^n (x_i - \bar{x})^2$ is distributed as $\sum_1^n x_i^2$ but with $n - 1$
instead of $n$, it may be shown that the variables

$$l_{11} = \sum_1^n (x_i - \bar{x})^2, \qquad l_{12} = \sum_1^n (x_i - \bar{x})(y_i - \bar{y}), \qquad \dots, \qquad l_{33} = \sum_1^n (z_i - \bar{z})^2$$

are distributed as the $l$'s in (27), but with $n - 1$ instead of $n$, $\bar{x}$, $\bar{y}$ and $\bar{z}$ being the means of
$x_i$, $y_i$ and $z_i$ respectively. The proof is omitted here.

Finally, it should be noted that the constants of the distributions dealt with in the above 
all enter automatically and need not be determined by a special process. The common 
source of all these constants is formula (4), giving the volume of the n-dimensional sphere. 


REFERENCES 

Fisher, R. A. (1915). Biometrika, 10, 507.
Fisher, R. A. (1925). Metron, 5, 90.
Wishart, J. (1928). Biometrika, 20A, 32.





PROOFS OF THE DISTRIBUTION LAW OF THE SECOND 
ORDER MOMENT STATISTICS 

By JOHN WISHART, School of Agriculture , Cambridge 

The foregoing paper by Prof. Fog (1948) calls to mind the different occasions on which proofs
have been derived, by various methods, of the result first given twenty years ago by the 
present author (Wishart, 1928). It may be useful at this stage to catalogue these proofs, so 
far as known to the writer, and to comment on the contrasting methods employed. 

The method first used was a direct extension of the geometrical method of approach used 
by R. A. Fisher (1915), and was worked out in full for three variates, then extended to the 
general case by the use of quadratic co-ordinates. At the end of 1933 Prof. Mahalanobis sent 
the author a somewhat fuller proof on the same lines, which was published some years later 
(Mahalanobis, Bose & Roy, 1937) as part of a long study of the normalization of statistical 
variates and the use of rectangular co-ordinates in the theory of sampling distributions. 
A proof on entirely different lines had been published before the communication referred to 
from Prof. Mahalanobis was received (Wishart & Bartlett, 1933). This depended upon working
out the characteristic function of the distribution and then using a generalization of the 
Fourier integral theorem to express the result as a multiple integral, detailed consideration 
to which was given by Ingham (1933). Special cases of this method, and of the resulting 
integrals, had previously been considered for two variables by Romanovsky (1925). 

Part of the difficulty, as in all similar problems in the theory of sampling distributions, 
resides in the fact that the practical problem requires the moment statistics to be calculated 
from the sample means. This general feature, which is allowed for by writing n — 1 for n , 
where n is the size of the sample, may be separated from the fundamental distribution 
problem, which is that of the simultaneous variation of the $\tfrac12 k(k + 1)$ sums of squares and
products of $k$ variate values $x_i$ $(i = 1, 2, \dots, k)$, which may be taken to follow a multivariate
normal distribution with zero means. This was recognized by P. L. Hsu (1939), who obtained 
a proof by the method of mathematical induction which has the merit of being short. 
Madow (1938) deduced the distribution by generalizing from Hotelling’s joint distribution 
of sample correlation coefficients. Of the text-book proofs those of Wilks (1943) and 
Kendall (1946) are geometrical, and that of Cramér (1946) determines the characteristic
function. Cramér also refers to a paper by Simonsen (1944, 1945).

The distribution is a direct generalization of the $\chi^2$ distribution to the case of vectors with
a number k (> 1) of components. Reference should also be made to Bartlett (1933) who 
deduced the corresponding partial distribution when a number of variates, about whose 
distribution nothing need be assumed, are eliminated, and to Anderson & Girshick (1944; 
Anderson, 1946) who deduced cases of the corresponding non-central distribution analogous 
to the similar problem in the case of $\chi^2$. Recently Elfving (1947) has used matrix methods
to deduce Bartlett’s decomposition theorem (Bartlett, 1933) and states that the general 
distribution can be proved in this way, although a formal proof is not given. 

Prof. Fog’s proof, which provides an interesting parallelism between the geometrical and 
analytical approaches to the problem, is only given for the case of three variates, but in 
contradistinction to former proofs his steps would seem to be such as to justify his statement 
that the generalization to an arbitrary value of k should cause no difficulties. It may not 
be out of place here to remark that the methods used by him in Examples 1 and 2 may be
used to give a geometrical interpretation to certain special orthogonal transformations
which have been used in deriving tests of significance. It is true that the variables $U$ need
not be particularly specified, but it helps the elementary student if, for example, $\sum_1^n (x - \bar{x})^2$
and, in the regression problem, $\sum_1^n (y - a - b(x - \bar{x}))^2$ can be directly expressed in a convenient
way as the sums of squares of $n - 1$ and $n - 2$ independent normal variates respectively.
A recent expository paper (Wishart, 1947) deduced all the results required for the proof of
the variance ratio distribution by using the orthogonal transformation

$$u_i = \dots \quad (i = 1, 2, \dots, n-1), \qquad u_n = \sqrt{n}\,\bar{x}_n,$$

where $x_1, x_2, \dots, x_n$ are normal (0, 1) and $\bar{x}_i = \sum_{r=1}^{i} (x_r)/i$. Then $\sum_1^n (x_i - \bar{x}_n)^2 = \sum_1^{n-1} (u_i^2)$. Similarly,
for a regression relationship of the form $Y = bx$ we can make the transformation (Vajda,
1945)

$$v_i = \dots \quad (i = 1, 2, \dots, n-1), \qquad v_n = \sqrt{\Sigma_n}\;b_n,$$

where $y_1, y_2, \dots, y_n$ are normal (0, 1), the $x_i$ can be regarded as fixed, and

$$b_i = \sum_{r=1}^{i} (x_r y_r)/\Sigma_i, \qquad \Sigma_i = \sum_{r=1}^{i} (x_r^2).$$

Then

$$\sum_1^n (y_i - b_n x_i)^2 = \sum_1^{n-1} (v_i^2).$$

In both cases (the first is a particular case of the second, in which $x_i$ is put equal to unity
for all $i$), the figure which applies is Fig. 1 of Prof. Fog's paper. $OQ^2 = q_1^2$ represents $\sum (x - \bar{x})^2$
in the one case and $\sum (y - bx)^2$ in the other, since the linear relations are $\sum (x/\sqrt{n}) = \sqrt{n}\,\bar{x}$
and $\sum (yx/\sqrt{\Sigma_n}) = \sqrt{\Sigma_n}\,b$ respectively, and this quantity is therefore the sum of the squares
of $(n - 1)$ perpendicular distances from $Q$ on to $(n - 1)$ hyperplanes through the origin
orthogonal to $\alpha$, and mutually orthogonal among themselves. Any one of an infinity of
planes can be chosen for the first, and this corresponds to the statement that the variables
$U$ need not be specified. In our first illustration the plane is chosen as $x_1 = x_2$, followed by
$x_1 + x_2 = 2x_3$, etc., whereas in the second illustration we start with $x_2 y_1 = x_1 y_2$ (where $y_1$
and $y_2$ are the variables, and the $x$'s are constants) and follow up with

$$x_3(x_1 y_1 + x_2 y_2) = y_3(x_1^2 + x_2^2), \text{ etc.}$$

Finally, the distributions required are easily deduced from the fact that the joint probability, 
after transformation, breaks up into the product of independent parts. 
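The two illustrations can be tried out numerically. The explicit expressions for $u_i$ and $v_i$ are not legible in the text as it stands, so the sketch below uses one standard choice that is consistent with the planes named above ($x_1 = x_2$, $x_1 + x_2 = 2x_3$, and $x_2 y_1 = x_1 y_2$, etc.): the Helmert form $u_i = \sqrt{(i/(i+1))}\,(\bar{x}_i - x_{i+1})$ and its regression analogue $v_i = \sqrt{(\Sigma_i/\Sigma_{i+1})}\,(b_i x_{i+1} - y_{i+1})$. These forms, the signs, and the function names are assumptions of this example and are not taken from the papers cited.

```python
import math


def helmert(x):
    """Assumed Helmert form: u_i = sqrt(i/(i+1)) * (mean of first i values - x_{i+1}),
    followed by u_n = sqrt(n) * mean of all n values."""
    n = len(x)
    u = []
    running = 0.0
    for i in range(1, n):
        running += x[i - 1]
        u.append(math.sqrt(i / (i + 1)) * (running / i - x[i]))
    running += x[n - 1]
    u.append(math.sqrt(n) * running / n)
    return u


def regression_transform(x, y):
    """Assumed regression analogue, with x fixed and x[0] != 0:
    v_i = sqrt(S_i / S_{i+1}) * (b_i * x_{i+1} - y_{i+1}), then v_n = sqrt(S_n) * b_n,
    where S_i = sum of x_r^2 and b_i = sum of x_r * y_r / S_i over r = 1..i."""
    n = len(x)
    v = []
    s = x[0] ** 2          # S_1
    sxy = x[0] * y[0]      # sum of x_r * y_r up to r = 1
    for i in range(1, n):
        b = sxy / s
        s_next = s + x[i] ** 2
        v.append(math.sqrt(s / s_next) * (b * x[i] - y[i]))
        s, sxy = s_next, sxy + x[i] * y[i]
    v.append(math.sqrt(s) * (sxy / s))
    return v


if __name__ == "__main__":
    x = [1.0, 2.0, 4.0, 8.0]
    u = helmert(x)
    mean = sum(x) / len(x)
    print(sum(t * t for t in u[:-1]), sum((xi - mean) ** 2 for xi in x))

    xs, ys = [1.0, 2.0, 3.0], [2.0, 1.0, 4.0]
    v = regression_transform(xs, ys)
    b = sum(a * c for a, c in zip(xs, ys)) / sum(a * a for a in xs)
    print(sum(t * t for t in v[:-1]), sum((c - b * a) ** 2 for a, c in zip(xs, ys)))
```

Putting every $x_i$ equal to unity in `regression_transform` gives back the values of `helmert` applied to the $y$'s, which matches the remark in the text that the first case is a particular case of the second.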

The more general regression relationship $Y = a + b(x - \bar{x})$ is dealt with by applying the
first transformation to both the $x$'s and $y$'s, reducing the sums of $n$ squares and products
to the sums of $(n - 1)$ squares and products of new variables $u$ and $u'$. To $u'$ is now applied
the second transformation (using $u$) to a variable $v'_i$ in which $i$ goes from 1 to $n - 2$, and

$$v'_{n-1} = \sqrt{\Sigma_{n-1}}\;b_{n-1},$$

where

$$\Sigma_{n-1} = \sum_{r=1}^{n-1} (u_r^2), \qquad b_{n-1} = \sum_{r=1}^{n-1} (u_r u'_r)/\Sigma_{n-1}.$$

We then have

$$\sum_1^n (y - Y)^2 = \sum_1^{n-2} (v_i'^2)$$

(Wishart, 1948). This represents a particular application of Fisher's general theorem (1925).



Geometrically, the position is as represented in the diagram, which is a further development
of Prof. Fog's Fig. 1. The first relation $\sum_1^n (y/\sqrt{n}) = \sqrt{n}\,\bar{y}$ projects the point $Y$ on to
plane I through the origin in the point $Q(y_1 - \bar{y}, \dots, y_n - \bar{y})$, and $OY^2 = z_1^2 + q_1^2$,
where $z_1 = \bar{y}\sqrt{n}$ and $q_1 = \sqrt{[\sum (y - \bar{y})^2]}$; the second relation
$\sum_1^{n-1} (u u'/\sqrt{\Sigma_{n-1}}) = \sqrt{\Sigma_{n-1}}\,b$ projects $Q$ on to a plane II through the
origin (orthogonal to plane I) in the point $Q'(y_1 - Y_1, \dots, y_n - Y_n)$, and $q_1^2 = z_2^2 + q_2^2$,
where $z_2 = \sqrt{\Sigma_{n-1}}\,b$ and $q_2 = \sqrt{[\sum (y - Y)^2]}$.
$Q'$ is confined to the $(n - 2)$ space determined by the intersection of planes I and II.
Altogether $OY^2 = z_1^2 + z_2^2 + q_2^2$, and $q_2^2$ can
then be determined as the sum of the squares
of $(n - 2)$ perpendicular distances from $Q'$ on
to $(n - 2)$ other hyperplanes through the
origin orthogonal to plane II, and mutually
orthogonal among themselves. Again an
infinity of planes can be chosen for the first,
and the one we take is $u_2 u'_1 = u_1 u'_2$, which
can if desired be expressed in terms of the
original variables $x$ and $y$ as
$(x_2 - x_3) y_1 + (x_3 - x_1) y_2 + (x_1 - x_2) y_3 = 0$. The mutual independence of the distributions of $\bar{y}\sqrt{n}$,
$b\sqrt{[\sum (x - \bar{x})^2]}$ and $\sum (y - Y)^2$, and their separate distribution functions, then follow.

Fig. 1.

REFERENCES 

Anderson, T. W. (1946). Ann. Math. Statist. 17, 409.
Anderson, T. W. & Girshick, M. A. (1944). Ann. Math. Statist. 15, 345.
Bartlett, M. S. (1933). Proc. Roy. Soc. Edinb. 53, 260.
Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.
Elfving, G. (1947). Skand. AktuarTidskr. 30, 56.
Fisher, R. A. (1915). Biometrika, 10, 507.
Fisher, R. A. (1925). Metron, 5, no. 3, p. 90.
Fog, D. (1948). Biometrika, 35, 46.
Hsu, P. L. (1939). Proc. Camb. Phil. Soc. 35, 336.
Ingham, A. E. (1933). Proc. Camb. Phil. Soc. 29, 270.
Kendall, M. G. (1946). The Advanced Theory of Statistics, vol. II. London: Griffin and Co.
Madow, W. G. (1938). Trans. Amer. Math. Soc. 44, 454.
Mahalanobis, P. C., Bose, R. C. & Roy, S. N. (1937). Sankhyā, 3, 1.
Romanovsky, V. (1925). Metron, 5, no. 4, p. 3.
Simonsen, W. (1944, 1945). Skand. AktuarTidskr. 27, 235; 28, 20.
Vajda, S. (1945). Ann. Math. Statist. 16, 381.
Wilks, S. S. (1943). Mathematical Statistics. Princeton University Press.
Wishart, J. (1928). Biometrika, 20A, 32.
Wishart, J. (1947). J. Inst. Actu. Stud. Soc. 7, 98.
Wishart, J. (1948). J. Inst. Actu. Stud. Soc. 8, 38.
Wishart, J. & Bartlett, M. S. (1933). Proc. Camb. Phil. Soc. 29, 260.





TESTS OF SIGNIFICANCE IN MULTIVARIATE ANALYSIS 

By C. RADHAKRISHNA RAO, King's College, Cambridge


CONTENTS

1. Introduction

2. Tests with discriminant functions

(a) Two fundamental distributions

(b) Problems of a single sample

(c) Mahalanobis' D² and problems of two samples

(d) Test for the equality of discriminant functions

3. Generalization of D² and the large sample theory for several groups

4. Tests with Wilks' Λ criterion

(a) Analysis of dispersion and the theoretical aspects of the Λ criterion

(b) The distribution of Λ and its practical use

(c) Test of differences in mean values for several populations

(d) Internal analysis of a set of variates

(e) Barnard's problem of secular variations in skull characters

References


1. Introduction 

Attempts have been made in recent years to generalize the univariate analysis of variance 
technique to the case of multiple variates. The extension of the theory has been slow and 
only a few methods have been made available for practical use. The starting-point of these 
researches is the simultaneous sampling distribution of the variances and covariances in 
samples from a multivariate normal population given by Wishart in 1928. A few years later 
Hotelling (1931) found the distribution of a quantity T which is a natural extension of 
Student’s distribution to a sample from a multivariate normal population. 

Wilks (1932) following the likelihood ratio method (Neyman & Pearson, 1928, 1931; Pearson
& Neyman, 1930) obtained suitable generalizations in the analysis of variance applicable
to several variables. The statistic Λ proposed by him has been found useful in a variety of
problems. Bartlett (1934) applied it for testing the significance of treatments with respect 
to two variables in a varietal trial and indicated its general use in multivariate tests of 
significance. Wilks (1935) and Hotelling (1935) found it useful in testing the independence 
of several groups of variates. Recently Plackett (1947) provided an exact test for judging 
the equality of variances and covariances of various populations, with the use of this 
criterion. Wilks’ statistic supplied some of the basic tests in multivariate analysis but the 
problem of tabulation has not been tackled except in some limited cases (Wald & Brookner, 
1941). A very useful approximation has been suggested by Bartlett (1938) who further 
demonstrated its use in a paper on ‘Multivariate Analysis’ read before the Royal Statistical 
Society in May 1947. 

A new line of research was initiated by Fisher (1936) with his introduction of the dis- 
criminant function analysis. It has been shown (Barnard, 1935; Martin, 1935; Fairfield 
Smith, 1936) that a set of multiple measurements may be used to provide a discriminant 
function linear in the observations having the property that, better than any other linear
function, it will discriminate between any two chosen classes such as taxonomic species, the
two sexes and so on. It has also been shown (Welch, 1939; Rao, 1947) that the linear
discriminant function chosen as stated above is the best among all classes of functions not
necessarily linear in the set of observations.

The introduction of the discriminant function led to a new method of deriving test criteria 
suitable for multiple variates. The problem is reduced to the case of a single variate by using 
a linear compound of the several variables, where the compounding coefficients are chosen 
so as to maximize the value of a statistic suitable for a single variate. The application of 
this method to test the differences in mean values for several groups gave rise to the theory 
of canonical roots of determinantal equations (Roy, 1939; Fisher, 1939; Hsu, 1939b). The 
distribution of the individual roots and the exact nature of tests require further study. 
Wilks’ statistic, which is a symmetric function of the canonical roots, may be con- 
sidered as providing an overall test of the hypothesis concerned. 

The object of the present paper is twofold. The first is to develop a unified approach to the 
problem of tests of significance in multivariate analysis. The concept of analysis of dispersion 
explained in §4 (a), which is a natural extension of the univariate analysis of variance, has 
been found useful in discussing multivariate problems. In a recent abstract, Hotelling (1947) 
developed a method of splitting a generalized measure of dispersion which appears to be 
different from the method proposed here. 

The second is to examine the nature of Bartlett’s (1938) approximation and supply 
appropriate methods for cases needing the exact evaluation of the probabilities. 

In presenting the various tests of significance developed in this paper it has been found 
convenient to consider the problems arising out of a single sample and two samples in the 
first stage. They depend on simple tests of significance requiring the use of variance ratio 
tables alone and are of very great importance in practice. The use of Wilks’ statistic in multi- 
variate analysis involving more than two samples is considered in the second stage. A number 
of examples have been worked out to explain the computational procedure. 


2. Tests with discriminant functions 
(a) Two fundamental distributions 

The method of discriminant functions to derive test criteria has been found extremely useful 
in multivariate analysis. The problem is reduced to the case of a single variable by choosing 
a linear compound of the variables and constructing a statistic suitable for the case of a 
single variate. The maximized value of this statistic obtained by a suitable choice of the 
compounding coefficients is taken as the appropriate test criterion. The distribution of the 
statistics thus derived in problems involving a single sample and two samples depends on 
the two fundamental distributions considered below. 

Let $(w_{ij})$, $i,j = 1, 2, \ldots, p$, be the matrix giving the estimates, on $n$ degrees of freedom, of 
the elements in the dispersion matrix $(\sigma_{ij})$ of $p$ normally correlated variables. The definition 
of $w_{ij}$ implies that it has been calculated from a certain sum of products by dividing by the 
appropriate degrees of freedom. Let $d_1, d_2, \ldots, d_p$ be $p$ normal variates with the same 
dispersion matrix $(\sigma_{ij})$ but distributed independently of the $w_{ij}$’s. Considering only the first 
$r$ variables $d_1, \ldots, d_r$, the statistic $V_r$ is defined as 

$$V_r = \frac{1}{n}\sum_{i=1}^{r}\sum_{j=1}^{r} w^{ij} d_i d_j, \qquad (2.1)$$

where $(w^{ij})$ is the matrix reciprocal to $(w_{ij})$, $i,j = 1, 2, \ldots, r$. An alternative method of 
calculating this statistic is provided by the equation 

$$1 + nV_r = \frac{|w_{ij} + d_i d_j|}{|w_{ij}|} \qquad (i,j = 1, 2, \ldots, r).$$

It has been shown by Hotelling (1931) that, when the $w_{ij}$ follow Wishart’s distribution 
(Wishart, 1928) with $n$ degrees of freedom and $E(d_1) = E(d_2) = \ldots = E(d_r) = 0$, the statistic 
$V_r(n+1-r)/r$ can be used as a variance ratio with $r$ and $(n+1-r)$ degrees of freedom. 

The author has shown elsewhere (Rao, 1946b) that, if $d_{r+1}, \ldots, d_p$ are distributed 
independently of $d_1, \ldots, d_r$ and $E(d_{r+1}) = \ldots = E(d_p) = 0$, $E(d_i)$ being not necessarily zero when 
$i = 1, 2, \ldots, r$, the statistic 

$$U = \frac{n+1-p}{p-r}\left(\frac{1+V_p}{1+V_r} - 1\right) \qquad (2.2)$$

can be used as a variance ratio with $(p-r)$ and $(n+1-p)$ degrees of freedom. The statistic 
$V_p$ is calculated from the formula (2.1) by using all the $p$ variates. 

All the tests of significance considered in this section depend on the use of the statistics 
defined in (2.1) and (2.2). 
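As an illustration only, the two statistics can be computed as in the following sketch. It assumes the forms $V_r = n^{-1}\sum\sum w^{ij}d_id_j$ and $U = \{(n+1-p)/(p-r)\}\{(1+V_p)/(1+V_r) - 1\}$; the function names `solve`, `v_stat`, `ratio_21` and `ratio_22` are hypothetical and do not come from the paper.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def v_stat(w, d, n, r):
    """V_r: quadratic form in the first r variates, divided by n.

    The reciprocal matrix is never formed; the equations w_r * lam = d_r
    are solved instead and the form is evaluated as sum(lam_i * d_i).
    """
    w_r = [row[:r] for row in w[:r]]
    d_r = d[:r]
    lam = solve(w_r, d_r)
    return sum(l * x for l, x in zip(lam, d_r)) / n


def ratio_21(w, d, n, r):
    """Variance ratio V_r (n+1-r)/r and its degrees of freedom (r, n+1-r)."""
    return v_stat(w, d, n, r) * (n + 1 - r) / r, (r, n + 1 - r)


def ratio_22(w, d, n, r):
    """Statistic U and its degrees of freedom (p-r, n+1-p), p = len(d)."""
    p = len(d)
    v_p = v_stat(w, d, n, p)
    v_r = v_stat(w, d, n, r)
    u = (n + 1 - p) / (p - r) * ((1 + v_p) / (1 + v_r) - 1)
    return u, (p - r, n + 1 - p)


# Toy input (made up): unit dispersion matrix, d = (1, 2), n = 10.
w_toy = [[1.0, 0.0], [0.0, 1.0]]
d_toy = [1.0, 2.0]
f_all, df_all = ratio_21(w_toy, d_toy, 10, 2)
u_extra, df_extra = ratio_22(w_toy, d_toy, 10, 1)
```

Solving the linear equations rather than inverting the matrix mirrors the hand computation used in the worked examples of this section.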

(b) Problems of a single sample 

Student’s test connected with pairs of observations admits generalization in two directions. 
The first is to test whether the means of p correlated variables are the same on the basis 
of a sample of size N from a p-variate population. When the test shows differences in mean 
values, there arises the question of deciding whether an assigned contrast involving the 
p variates differs from the best contrast as determined from the data. 

If $x_{1i}, x_{2i}, \ldots, x_{pi}$ are the observations on the $i$th individual, then they may be replaced 
by a linear compound $z_i = l_1x_{1i} + \ldots + l_px_{pi}$ where the $l$’s satisfy the condition $l_1 + \ldots + l_p = 0$. 

The problem of determining the best contrast reduces to that of determining the 
compounding coefficients $l_1, \ldots, l_p$ such that the ratio of mean $z$ to standard deviation of $z$ is 
a maximum. An alternative method which has some practical advantage is as follows. 

By arbitrary choice of constants one can construct $(p-1)$ linear combinations of the 
variables $x_1, \ldots, x_p$, 

$$y_j = m_{1j}x_1 + \ldots + m_{pj}x_p,$$

such that $\sum_i m_{ij} = 0$ for $j = 1, 2, \ldots, (p-1)$. Choosing a linear compound of the $x$’s with 
coefficients adding to zero is the same as choosing a linear compound of the $y$’s without any 
restriction on the compounding coefficients. If the linear compound is 

$$\lambda_1y_1 + \lambda_2y_2 + \ldots + \lambda_{p-1}y_{p-1},$$

then the quantity to be maximized is 

$$v = \frac{\left(\sum_i \lambda_i\bar{y}_i\right)^2}{\sum_i\sum_j \lambda_i\lambda_jw_{ij}},$$

where $w_{ij} = \frac{1}{N-1}\sum_r (y_{ir} - \bar{y}_i)(y_{jr} - \bar{y}_j)$. 

Observing that only the ratios of the $\lambda$’s are uniquely determinable, the equations giving the $\lambda$’s 
may be written as 

$$\lambda_1w_{1i} + \ldots + \lambda_{p-1}w_{p-1,i} = \bar{y}_i \qquad (i = 1, 2, \ldots, (p-1)),$$

with the solution 

$$\lambda_i = w^{1i}\bar{y}_1 + \ldots + w^{p-1,i}\bar{y}_{p-1} \qquad (i = 1, 2, \ldots, (p-1)),$$

where the matrix $(w^{ij})$ is reciprocal to $(w_{ij})$. This supplies the best linear compound of the 
$y$’s which on transformation to the $x$’s gives the best contrast determinable from the data. 



The maximum value of $v$ is given by 

$$v = \sum_i\sum_j w^{ij}\bar{y}_i\bar{y}_j.$$

If $V_{p-1} = N(\sum\sum w^{ij}\bar{y}_i\bar{y}_j)/(N-1)$, then, on the hypothesis that all the $x$’s have the same mean 
value, the conditions required for the use of the statistic (2.1) are satisfied so that 

$$V_{p-1}(N-p+1)/(p-1)$$

can be used as the variance ratio with $(p-1)$ and $(N-p+1)$ degrees of freedom to test the 
above hypothesis. 

The statistic $V_{p-1}$ is invariant for all sets of coefficients chosen to construct the $y$’s from 
the $x$’s so that in any practical problem either conveniently or conventionally chosen linear 
contrasts of the $x$’s may be used to define the $y$’s. 

The test used above is essentially the one given by Hotelling (1931). The point of interest 
is to show how the test is derivable by the method of discriminant functions involving some 
restrictions on the compounding coefficients. Also the author is not aware whether the use 
of Hotelling’s T in testing the equality of means of p correlated variables has been explicitly 
mentioned anywhere. 

To test whether the best contrast as determined from the data is in agreement with an 
assigned contrast $\xi_1x_1 + \ldots + \xi_px_p$, or $\eta_1y_1 + \ldots + \eta_{p-1}y_{p-1}$ in terms of the $y$’s, one may proceed 
as follows. 

The appropriate statistic for testing the significance of the assigned contrast is 

$$V_1 = \frac{N(\eta_1\bar{y}_1 + \ldots + \eta_{p-1}\bar{y}_{p-1})^2}{(N-1)\sum_i\sum_j \eta_i\eta_jw_{ij}},$$

where $V_1(N-1)$ is the variance ratio with 1 and $(N-1)$ degrees of freedom. The appropriate 
statistic for all the $(p-1)$ contrasts is $V_{p-1}$ as considered before. The hypothesis specifies 
that all contrasts orthogonal to the assigned one have zero mean so that the conditions for 
the use of the statistic (2.2) are satisfied. Hence 

$$U = \frac{N-p+1}{p-2}\left(\frac{1+V_{p-1}}{1+V_1} - 1\right)$$

can be used as the variance ratio with $(p-2)$ and $(N-p+1)$ degrees of freedom to test the 
above hypothesis. 

The above test can be generalized to answer the problem whether a set of $k$ assigned 
contrasts contains the best contrast. In this case the statistic 

$$U = \frac{N-p+1}{p-k-1}\left(\frac{1+V_{p-1}}{1+V_k} - 1\right)$$

can be used as the variance ratio with $(p-k-1)$ and $(N-p+1)$ degrees of freedom. 

The second generalization of Student’s $t$ is concerned with testing, on the basis of a sample 
of size $N$ from a $2p$-variate population containing the variables $y_1, \ldots, y_{2p}$, whether 
$E(y_i) = E(y_{i+p})$ for $i = 1, 2, \ldots, p$. 

From the $2p$ variates $y_1, \ldots, y_{2p}$ one can construct the $p$ variates $z_i = y_i - y_{i+p}$, $i = 1, 2, \ldots, p$, 
in which case the problem reduces to that of testing the hypothesis $E(z_i) = 0$, $i = 1, 2, \ldots, p$. 
The variance ratio with $p$ and $(N-p)$ degrees of freedom to test the above hypothesis is 

$$\frac{N(N-p)}{p(N-1)}\sum_i\sum_j w^{ij}\bar{z}_i\bar{z}_j,$$

where $(w^{ij})$ is reciprocal to $(w_{ij})$ giving the estimates of the variances and covariances of 
the $z$’s. 

This test may be useful in biometry where the asymmetry of organisms is considered. 
The sets of variables $y_1, \ldots, y_p$ and $y_{p+1}, \ldots, y_{2p}$ will then correspond to measurements on 
the right and left sides of an organism. 

Example 1. The data of Table 1 consist of weights of cork borings taken by the author 
from the north (N), east (E), south (S) and west (W) directions of the trunk for 28 trees in 
a block of plantations. The problem is to test whether the bark deposit varies in thickness 
and hence in weight in the four directions. It was suggested by Prof. Mahalanobis that the 
bark deposit is likely to be uniform in N and S directions and also uniform but less in E and 
W directions, so that N - E — W + S can be taken as the best contrast. This can, however, 
be tested from the given data as shown below. 


Table 1. Weights of cork borings (in centigrams) in the four directions for 28 trees 

    N     E     S     W   |     N     E     S     W
   72    66    76    77   |    91    79   100    75
   60    53    66    63   |    56    68    47    50
   56    57    64    58   |    79    65    70    61
   41    29    36    38   |    81    80    68    58
   32    32    35    36   |    78    55    67    60
   30    35    34    26   |    46    38    37    38
   39    39    31    27   |    39    35    34    37
   42    43    31    25   |    32    30    30    32
   37    40    31    25   |    60    50    67    54
   33    29    27    36   |    35    37    48    39
   32    30    34    28   |    39    36    39    31
   63    45    74    63   |    50    34    37    40
   54    46    60    52   |    43    37    39    50
   47    51    52    43   |    48    54    57    43

It has been found in similar studies that there exists a significant correlation between 
contrasts such as (N — E) and (S — W) so that the method of fitting constants for the four 
directions and the individual trees by the method of least squares is not appropriate. 
The three contrasts arising out of the four weights may, however, be treated as three 
correlated variables in which case the theory developed above is applicable. 

It is interesting to observe that the individual weights in Table 1 are exceedingly asym- 
metrically distributed.* This does not, however, invalidate the test so long as the contrasts 
are normally distributed. In fact, the distribution of the individual weights depends on the 
nature of the plants and the variation between plants. If the above condition is satisfied, it is 
not necessary that the individual weights should follow any distribution law of the known 
type. It may be sometimes necessary to make a transformation (such as log, square or 
cube root) of the variables under consideration to ensure that the contrasts of the trans- 
formed variables are symmetrically distributed if the contrasts of the original variables 
are not so. 

As observed earlier the contrasts may be conveniently or conventionally chosen. For the 
above example one may choose the simple set of contrasts 

$y_1 = N - E - W + S$, $y_2 = S - W$, $y_3 = N - S$. 

* I am indebted to Prof. E. S. Pearson for drawing my attention to this fact. 




The mean values and estimates of variances and covariances based on 27 degrees of 
freedom for the $y$’s are $\bar{y}_1 = 8.8571$, $\bar{y}_2 = 4.5000$, $\bar{y}_3 = 0.8571$, 

$$(w_{ij}) = \begin{pmatrix} 128.7200 & 61.4076 & -21.0211 \\ 61.4076 & 56.9259 & -28.2963 \\ -21.0211 & -28.2963 & 63.5344 \end{pmatrix}.$$

The coefficients in the best linear function $\lambda_1y_1 + \lambda_2y_2 + \lambda_3y_3$ are given by the equations 

$$128.7200\lambda_1 + 61.4076\lambda_2 - 21.0211\lambda_3 = 8.8571,$$
$$61.4076\lambda_1 + 56.9259\lambda_2 - 28.2963\lambda_3 = 4.5000,$$
$$-21.0211\lambda_1 - 28.2963\lambda_2 + 63.5344\lambda_3 = 0.8571.$$

Solving, one gets $\lambda_1 = 0.05620$, $\lambda_2 = 0.04416$, $\lambda_3 = 0.05174$, 
so that the best contrast is 

$$\lambda_1(N - E - W + S) + \lambda_2(S - W) + \lambda_3(N - S) = 0.10794N - 0.05620E - 0.10035W + 0.04861S,$$

or $1.0794N - 0.5620E - 1.0035W + 0.4861S$, 
obtained by multiplying the coefficients by 10 (arbitrarily). 

The statistic for testing the hypothesis of equality of means is 

$$V_{p-1} = \frac{N}{N-1}(\lambda_1\bar{y}_1 + \lambda_2\bar{y}_2 + \lambda_3\bar{y}_3) = \frac{28}{27}(0.740790) = 0.768226,$$

$$V_{p-1}(N-p+1)/(p-1) = 0.768226(28 - 4 + 1)/3 = 6.4019.$$

The quantity 6.4019 as the variance ratio with 3 and 25 degrees of freedom is significant 
at 1 % level so that the bark deposit cannot be considered uniform in the four directions. 

The assigned contrast is represented by $y_1$ and to test for its significance one can construct 
the statistic 

$$V_1 = \frac{N}{N-1}\,\frac{\bar{y}_1^2}{w_{11}} = \frac{28}{27}\,\frac{(8.8571)^2}{128.7200} = 0.632020.$$

The quantity $(N-1)V_1 = 17.0645$ as the variance ratio with 1 and 27 degrees of freedom is 
highly significant. 

To test whether the assigned contrast agrees with that estimated from the data, the 
statistic $U$ defined in (2.2) has to be calculated. 

$$U = \frac{N-p+1}{p-2}\left(\frac{1+V_{p-1}}{1+V_1} - 1\right) = \frac{25}{2}\left(\frac{1.768226}{1.632020} - 1\right) = 1.0431.$$

This value of $U$ as the variance ratio with 2 and 25 degrees of freedom is small so that the 
evidence supplied by the data is not sufficient to reject the assigned contrast as not the best, 
although the ratios of the coefficients in the estimated contrast depart considerably from 
those assigned. 
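The arithmetic of Example 1 can be retraced with the short sketch below, which is an illustration and not the author's own computing scheme. It starts from the 28 rows of Table 1, with three entries read as 53, 56 and 58 because those readings are the ones that reproduce the quoted means of the contrasts; the helper `solve` and all variable names are hypothetical. Since the printed figures were rounded at intermediate steps, agreement is expected only to about three significant figures.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Table 1 weights (N, E, S, W), one tuple per tree.
# Assumption: entries 53, 56, 58 (trees 2, 3, 18) are read so that the
# column totals agree with the means quoted in the text.
CORK = [
    (72, 66, 76, 77), (60, 53, 66, 63), (56, 57, 64, 58), (41, 29, 36, 38),
    (32, 32, 35, 36), (30, 35, 34, 26), (39, 39, 31, 27), (42, 43, 31, 25),
    (37, 40, 31, 25), (33, 29, 27, 36), (32, 30, 34, 28), (63, 45, 74, 63),
    (54, 46, 60, 52), (47, 51, 52, 43), (91, 79, 100, 75), (56, 68, 47, 50),
    (79, 65, 70, 61), (81, 80, 68, 58), (78, 55, 67, 60), (46, 38, 37, 38),
    (39, 35, 34, 37), (32, 30, 30, 32), (60, 50, 67, 54), (35, 37, 48, 39),
    (39, 36, 39, 31), (50, 34, 37, 40), (43, 37, 39, 50), (48, 54, 57, 43),
]

n_trees = len(CORK)   # N = 28
p = 4                 # four directions

# Contrasts y1 = N - E - W + S, y2 = S - W, y3 = N - S for every tree.
ys = [(a - e - w + s, s - w, a - s) for (a, e, s, w) in CORK]
ybar = [sum(col) / n_trees for col in zip(*ys)]

# Estimated dispersion matrix of the contrasts on N - 1 degrees of freedom.
W = [[sum((y[i] - ybar[i]) * (y[j] - ybar[j]) for y in ys) / (n_trees - 1)
      for j in range(3)] for i in range(3)]

lam = solve(W, ybar)                       # coefficients of the best contrast
V = n_trees / (n_trees - 1) * sum(l * m for l, m in zip(lam, ybar))
F = V * (n_trees - p + 1) / (p - 1)        # d.f. (p - 1, N - p + 1) = (3, 25)

V1 = n_trees / (n_trees - 1) * ybar[0] ** 2 / W[0][0]   # assigned contrast y1
U = (n_trees - p + 1) / (p - 2) * ((1 + V) / (1 + V1) - 1)  # d.f. (2, 25)

print(round(F, 3), round((n_trees - 1) * V1, 3), round(U, 3))
```

Comparing `F`, `(N - 1) * V1` and `U` with variance-ratio tables at the degrees of freedom noted in the comments leads to the conclusions stated in the text.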

(c) Mahalanobis’ $D^2$ and problems of two samples 

Let $N_1$ and $N_2$ be the sample sizes from two populations $\pi_1$ and $\pi_2$ characterized by $(p+q)$ 
variates. The sample means for the $i$th character are represented by $\bar{x}_{i1}$ and $\bar{x}_{i2}$ for $\pi_1$ and $\pi_2$ 
respectively. The estimated value of the covariance is given by 

$$(N_1 + N_2 - 2)w_{ij} = \sum_{t=1}^{N_1}(x_{i1t} - \bar{x}_{i1})(x_{j1t} - \bar{x}_{j1}) + \sum_{t=1}^{N_2}(x_{i2t} - \bar{x}_{i2})(x_{j2t} - \bar{x}_{j2}).$$

Mahalanobis’ (1936) distance between the two populations as estimated from the sample 
on the basis of the first $p$ characters is 

$$D_p^2 = \sum_{i=1}^{p}\sum_{j=1}^{p} w^{ij}(\bar{x}_{i1} - \bar{x}_{i2})(\bar{x}_{j1} - \bar{x}_{j2}),$$

where $(w^{ij})$ is reciprocal to $(w_{ij})$, $i,j = 1, 2, \ldots, p$. The exact distribution of $D_p^2$ on the 
hypothesis specifying real differences in mean values has been given by Bose & Roy (1938). 
To test the hypothesis specifying no difference in mean values of the $p$ characters for $\pi_1$ 
and $\pi_2$ the statistic 

$$\frac{N_1N_2(N_1 + N_2 - p - 1)}{p(N_1 + N_2)(N_1 + N_2 - 2)}\,D_p^2$$

can be used as the variance ratio with $p$ and $(N_1 + N_2 - 1 - p)$ degrees of freedom. The 
appropriate distance function based on $(p+q)$ characters is 

$$D_{p+q}^2 = \sum_{i=1}^{p+q}\sum_{j=1}^{p+q} w^{ij}(\bar{x}_{i1} - \bar{x}_{i2})(\bar{x}_{j1} - \bar{x}_{j2}),$$

with $(w^{ij})$ now reciprocal to $(w_{ij})$, $i,j = 1, 2, \ldots, p+q$. 

To test whether the $q$ additional characters lead to significant differences between $\pi_1$ and $\pi_2$, 
independently of the first $p$ characters, a comparison may be made of the magnitudes of 
$D_p^2$ and $D_{p+q}^2$. One may choose the ratio 

$$R = \frac{1 + N_1N_2D_{p+q}^2/(N_1 + N_2)(N_1 + N_2 - 2)}{1 + N_1N_2D_p^2/(N_1 + N_2)(N_1 + N_2 - 2)},$$

which has some theoretical advantage as observed in the next section. The conditions for 
the use of the statistic defined in (2.2) are satisfied so that 

$$U = \frac{N_1 + N_2 - p - q - 1}{q}(R - 1)$$

is the variance ratio with $q$ and $(N_1 + N_2 - p - q - 1)$ degrees of freedom. 

It has been shown by Fisher (1938) that the test with Mahalanobis’ $D^2$ is equivalent to 
testing the difference in the mean values of the discriminant function as estimated from the 
samples. Apart from tests of significance, the discriminant function is used for the purpose 
of assigning an individual to its proper class. This leads us to the problem of determining the 
number and nature of characters used in order that the errors of classification may not be 
large. It is, however, to be expected that the errors of classification will decrease as the 
number of characters is increased. On the other hand, since the various characters are 
correlated with one another the addition of some characters to a basic panel may not reduce 
or at any rate substantially decrease the errors. In this case, it may not be worth while to 
increase the number of characters so that the numerical computation involved may not be 
heavy. A test designed to judge the significance of the reduction in errors of classification 
by the addition of some new characters is valuable in problems of this nature. Such a test 
is mathematically equivalent to a test of differences in mean values of the new variables 
after eliminating the differences in the basic characters. If p is the number of basic characters 
and q the number of additional characters the test may also be conceived as a comparison 
of Mahalanobis’ distances based on p and (p + q) characters. 

If it is desired to test whether an assigned discriminant function $\eta_1x_1 + \ldots + \eta_px_p$ of 
$p$ characters differs from that calculated from the data, then the statistic 

$$\frac{N_1 + N_2 - p - 1}{p - 1}\left\{\frac{1 + N_1N_2D_p^2/(N_1 + N_2)(N_1 + N_2 - 2)}{1 + N_1N_2D_1^2/(N_1 + N_2)(N_1 + N_2 - 2)} - 1\right\},$$

where $D_1^2 = \{\sum_i \eta_i(\bar{x}_{i1} - \bar{x}_{i2})\}^2/\sum_i\sum_j \eta_i\eta_jw_{ij}$, can be used as the variance ratio with 
$(p-1)$ and $(N_1 + N_2 - p - 1)$ degrees of freedom (Fisher, 1940; Bartlett, 1939). 





Example 2. The following Tables 2 and 3 reproduced from Fisher (1938) give the mean 
values based on fifty observations each and the covariances based on 50 + 50 - 2 degrees 
of freedom for four characters in two species of plants Iris versicolor and Iris setosa. 

The solution of the equations 

$$\lambda_1w_{1i} + \ldots + \lambda_4w_{4i} = d_i \qquad (i = 1, 2, 3, 4)$$

is obtained as 

$$\lambda_1 = -3.0528, \quad \lambda_2 = -18.0229, \quad \lambda_3 = 21.7662, \quad \lambda_4 = 30.8442,$$

so that the discriminant function is 

$$-3.0528x_1 - 18.0229x_2 + 21.7662x_3 + 30.8442x_4.$$

The value of $D_4^2$ is* $\lambda_1d_1 + \ldots + \lambda_4d_4 = 103.2335$. 

To test for the differences in mean values one can use the statistic 

$$\frac{N_1N_2(N_1 + N_2 - 1 - 4)}{(N_1 + N_2)(N_1 + N_2 - 2)\,4}\,D_4^2 = \frac{50 \times 50 \times 95}{100 \times 98 \times 4} \times 103.2335 = \frac{95}{4}(26.3350) = 625.4515,$$

which as the variance ratio with 4 and 95 degrees of freedom is highly significant. 


Table 2. Observed mean values based on fifty observations each for the two species 

Character              Iris versicolor    Iris setosa    Difference (d)
Sepal length (x_1)          5.936            5.006           0.930
Sepal width  (x_2)          2.770            3.428          -0.658
Petal length (x_3)          4.260            1.462           2.798
Petal width  (x_4)          1.326            0.246           1.080


Table 3. The pooled covariance matrix based on 98 degrees of freedom 

          x_1         x_2         x_3         x_4
x_1    0.195340    0.092200    0.099626    0.033055
x_2    0.092200    0.121079    0.047175    0.025251
x_3    0.099626    0.047175    0.125488    0.039586
x_4    0.033055    0.025251    0.039586    0.025106


If only the lengths are considered (i.e. $x_1$ and $x_3$) the equations leading to the discriminant 
function are 

$$0.195340\mu_1 + 0.099626\mu_2 = 0.930,$$
$$0.099626\mu_1 + 0.125488\mu_2 = 2.798,$$

so that $\mu_1 = -11.1088$, $\mu_2 = 31.1163$ and 

$$D_2^2 = \mu_1(0.930) + \mu_2(2.798) = 76.7322.$$

To answer the question whether the widths ($x_2$ and $x_4$) supply independent information 
the significance of the ratio 

$$R = \frac{1 + N_1N_2D_4^2/(N_1 + N_2)(N_1 + N_2 - 2)}{1 + N_1N_2D_2^2/(N_1 + N_2)(N_1 + N_2 - 2)} = \frac{1 + 26.3350}{1 + 19.5745} = 1.3286$$

has to be tested. The value of the statistic 

$$U = (R - 1)(N_1 + N_2 - 1 - 4)/2 = 95(0.3286)/2 = 15.6085$$

as the variance ratio with 2 and 95 degrees of freedom is significant at 1 % level. This shows 
that the widths in association with the lengths lead to further discrimination of the species, 
so that there is a significant reduction in the errors of classification. 

* This method of calculating $D^2$ avoids the actual inversion of the matrix $(w_{ij})$. The $\lambda$ coefficients 
obtained by this process can be directly used in the construction of the discriminant function. 
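The figures of Example 2 can be checked with the following sketch, offered as an illustration rather than as the author's procedure. It uses only the differences of Table 2 and the pooled covariances of Table 3; the helper names `solve` and `distance` are hypothetical. As in the text, $D^2$ is obtained by solving the linear equations and not by inverting the matrix.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


# Differences in means (Table 2) and pooled covariance matrix (Table 3).
D = [0.930, -0.658, 2.798, 1.080]
W = [
    [0.195340, 0.092200, 0.099626, 0.033055],
    [0.092200, 0.121079, 0.047175, 0.025251],
    [0.099626, 0.047175, 0.125488, 0.039586],
    [0.033055, 0.025251, 0.039586, 0.025106],
]
N1 = N2 = 50


def distance(idx):
    """D^2 on the characters listed in idx, with the coefficients used."""
    sub_w = [[W[i][j] for j in idx] for i in idx]
    sub_d = [D[i] for i in idx]
    coef = solve(sub_w, sub_d)
    return sum(c * x for c, x in zip(coef, sub_d)), coef


d2_all, lam = distance([0, 1, 2, 3])   # all four characters
d2_len, mu = distance([0, 2])          # the two lengths only

c = N1 * N2 / ((N1 + N2) * (N1 + N2 - 2))
f_all = c * (N1 + N2 - 1 - 4) / 4 * d2_all        # d.f. (4, 95)

p, q = 2, 2                                        # lengths, then widths
R = (1 + c * d2_all) / (1 + c * d2_len)
U = (N1 + N2 - p - q - 1) / q * (R - 1)            # d.f. (2, 95)

print(round(d2_all, 3), round(d2_len, 3), round(R, 4), round(U, 2))
```

The same `distance` helper can be pointed at any subset of characters, which is the comparison of a basic panel with an enlarged one described earlier in this section.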


(d) Test for the equality of discriminant functions 

If four samples of sizes $N_1$, $N_2$, $N_3$ and $N_4$ from populations $\pi_1$, $\pi_2$, $\pi_3$ and $\pi_4$ are available 
one can test whether the discriminant functions between $\pi_1$, $\pi_2$ and $\pi_3$, $\pi_4$ are significantly 
different by an extension of the test criterion discussed above. It is a necessary condition of 
the test that the variances and covariances are identical in the four populations $\pi_1$, $\pi_2$, $\pi_3$ 
and $\pi_4$. No reasonably simple test can be constructed to establish the equivalence of the 
discriminant functions when this condition is not satisfied. 

Let $(w_{ij})$ be the dispersion matrix based on $(N_1 + N_2 + N_3 + N_4 - 4)$ degrees of freedom. 
If $d_1, \ldots, d_p$ are the differences in mean values for $\pi_1$ and $\pi_2$ and $d_1', \ldots, d_p'$ are those for $\pi_3$ 
and $\pi_4$, the test for equality of discriminant functions is identical with the testing of the 
hypothesis 

$$E(d_i) = E(d_i') \qquad (i = 1, 2, \ldots, p),$$

or 

$$E(d_i) = E(-d_i') \qquad (i = 1, 2, \ldots, p).$$

The variance ratios with $p$ and $n = (N_1 + N_2 + N_3 + N_4 - 3 - p)$ degrees of freedom for the 
two cases are 

$$\frac{n}{p(n+p-1)}\,\frac{1}{m}\sum_i\sum_j w^{ij}(d_i - d_i')(d_j - d_j')$$

and 

$$\frac{n}{p(n+p-1)}\,\frac{1}{m}\sum_i\sum_j w^{ij}(d_i + d_i')(d_j + d_j'),$$

where 

$$m = \frac{1}{N_1} + \frac{1}{N_2} + \frac{1}{N_3} + \frac{1}{N_4}.$$

The equality of discriminant functions is indicated if at least one of the statistics is not 
significant. Similar tests can be constructed for judging the differences in discriminant 
functions in parallel samples from two populations or between $\pi_1$, $\pi_2$ and $\pi_1$, $\pi_3$. 


3. Generalization of $D^2$ and the large sample theory for several groups 

Let there be $k$ multivariate populations $\pi_1, \pi_2, \ldots, \pi_k$ from which samples of sizes $N_1, N_2, \ldots, N_k$ 
are available for $p+q$ characters. The common covariance matrix assumed to be known or 
estimated on a large number of degrees of freedom is represented by $(\alpha_{ij}^{(p)})$ for the first $p$ 
characters and by $(\alpha_{ij}^{(p+q)})$ for the $(p+q)$ characters. The inverse of $(\alpha_{ij}^{(p)})$ is represented 
by $(\alpha^{ij}_{(p)})$ and that of $(\alpha_{ij}^{(p+q)})$ by $(\alpha^{ij}_{(p+q)})$. Let $\bar{x}_{i1}, \bar{x}_{i2}, \ldots$ be the mean values of the $i$th character 
in the first, second, etc., populations. 

It has been shown by Hsu (1939a) and Rao (1945) that the statistic 

$$V_{p,k} = \sum_{i,j=1}^{p} \alpha^{ij}_{(p)} \sum_{r=1}^{k} N_r(\bar{x}_{ir} - \bar{x}_{i\cdot})(\bar{x}_{jr} - \bar{x}_{j\cdot}),$$

where $\bar{x}_{i\cdot} = (\sum N_r\bar{x}_{ir})/(\sum N_r)$, can be used as $\chi^2$ with $p(k-1)$ degrees of freedom to test the 
hypothesis that the mean values are the same in all the $k$ populations for these $p$ characters. 
The statistic $V_{p,k}$ is a suitable generalization of the Mahalanobis’ $D^2$ in its classical form and 
its theoretical derivation has been discussed by the author (Rao, 1945). 



When this test indicates differences in mean values it is, in some problems, necessary to 
test whether the observations on $q$ additional characters supply independent information for 
discrimination. The statistic for testing the differences in means for all the $p+q$ characters is 

$$V_{p+q,k} = \sum_{i,j=1}^{p+q} \alpha^{ij}_{(p+q)} \sum_{r=1}^{k} N_r(\bar{x}_{ir} - \bar{x}_{i\cdot})(\bar{x}_{jr} - \bar{x}_{j\cdot}),$$

which can be used as a $\chi^2$ with $(p+q)(k-1)$ degrees of freedom. The $q(k-1)$ additional 
degrees of freedom bring in the contribution 

$$V_{p+q,k} - V_{p,k},$$

and the significance of this difference can be appropriately used to judge the significance of 
the information supplied by the additional characters. This difference can be used as a $\chi^2$ 
with $q(k-1)$ degrees of freedom as shown below. 

The hypothesis that the new characters do not lead to further discrimination of the 
populations specifies that any linear function of the $(p+q)$ characters uncorrelated with 
each of the $p$ characters has the same mean value for all the $k$ populations. There are $q$ such 
linear functions and treating them as $q$ variables a $\chi^2$ with $q(k-1)$ degrees of freedom can be 
constructed to test the above hypothesis. The above method of taking the difference is only 
an alternative way of calculating this $\chi^2$. For $V_{p+q,k}$ calculated from all the $(p+q)$ characters, 
being invariant under linear transformations of the variables, is equal to $V_{p,k} + \chi^2$ calculated 
from the $p$ original characters and the $q$ linear functions chosen to be uncorrelated with each 
of the $p$ characters. 

In the above derivation it has been assumed that the variances and covariances are known 
and the distributions are asymptotically correct when they are estimated on a large number 
of degrees of freedom. When more than two populations are involved the pooled estimates 
of the covariances have, usually, a sufficiently large number of degrees of freedom to validate 
the use of the asymptotic distributions. More exact tests for cases involving small numbers 
of degrees of freedom are given in the next section. 
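A small sketch of the large-sample statistic may make the bookkeeping concrete. It assumes the form $V = \sum\sum \alpha^{ij}\sum_r N_r(\bar{x}_{ir} - \bar{x}_{i\cdot})(\bar{x}_{jr} - \bar{x}_{j\cdot})$ with a known common covariance matrix; the function names and the toy numbers are invented for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def group_statistic(alpha, sizes, means):
    """Generalized distance statistic for k groups.

    alpha : common covariance matrix of the characters used
    sizes : sample sizes N_1, ..., N_k
    means : one list of character means per group
    Returns the statistic and its chi-square degrees of freedom.
    """
    chars = len(alpha)
    total = sum(sizes)
    grand = [sum(n * m[i] for n, m in zip(sizes, means)) / total
             for i in range(chars)]
    v = 0.0
    for n, m in zip(sizes, means):
        dev = [m[i] - grand[i] for i in range(chars)]
        coef = solve(alpha, dev)          # alpha^{-1} dev without inversion
        v += n * sum(c * x for c, x in zip(coef, dev))
    return v, chars * (len(sizes) - 1)


# Toy input (made up): k = 3 groups, p = 1 basic and q = 1 added character,
# taken as uncorrelated with unit variances.
sizes = [10, 20, 30]
means_pq = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
alpha_pq = [[1.0, 0.0], [0.0, 1.0]]

v_pq, df_pq = group_statistic(alpha_pq, sizes, means_pq)
v_p, df_p = group_statistic([[1.0]], sizes, [[m[0]] for m in means_pq])

extra = v_pq - v_p          # contribution of the added character
df_extra = df_pq - df_p     # q(k - 1)
```

With uncorrelated characters the difference reduces to the statistic of the added character alone, which is a quick sanity check on any implementation.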


4. Tests with Wilks’ Λ criterion 
(a) Analysis of dispersion and the theoretical aspects of the Λ criterion 

In the univariate analysis of variance tests of significance reduce to the comparison of two 
independently distributed mean squares. One of the mean squares is an unbiased estimate 
of the variance to which a single observation in any particular class is subject and is called 
the error variance. The other is only so when the null hypothesis which is being tested is 
correct and may be called the mean square due to deviation from the hypothesis. The test 
depends only on the individual degrees of freedom of two mean squares. 

When each observation consists of $p$ mutually correlated variables there are $p$ total sums 
of squares and $p(p-1)/2$ total sums of products which can be analysed into various 
categories. This process which involves the technique of analysing the variances and covariances 
of multiple correlated variables may be termed the analysis of dispersion.* The term 
dispersion has been used by Prof. Mahalanobis to indicate the scatter of a set of 
observations as measured by the variances and covariances. Following this terminology the total 
dispersion may be said to be analysed into dispersion due to various categories. 

* Prof. R. A. Fisher suggested that this method can be described as either analysis of covariance 
or analysis of dispersion. Since the term covariance analysis is conventionally used in problems of 
adjustment for concomitant variation, I have used the term analysis of dispersion to cover a wider 
variety of problems considered in this article. 

If we represent the total sum of products by the matrix $S = (S_{ij})$, then the analysis of 
dispersion consists in analysing each element such as $S_{ij}$, according to the usual procedure, 
into various categories with the corresponding distribution of degrees of freedom. The 
dispersion due to any category supplies the sum of products (denoted for brevity by s.p.) 
matrix which on division by the degrees of freedom gives the mean product (denoted by 
m.p.) matrix. The s.p. matrix leading to unbiased estimates of the variances and covariances 
to which a single set of variables is subject is called the s.p. matrix due to error. This error 
matrix may be denoted by $W$ with $w$ as its degrees of freedom. In the analysis of dispersion 
the s.p. matrix due to any other category leads to unbiased estimates of variances and 
covariances only when the null hypothesis regarding that category is true. This may be 
called the s.p. matrix due to deviation from the hypothesis. If such a matrix is represented 
by $Q$ with $q$ as its degrees of freedom, then the problem of testing the null hypothesis consists 
in comparing the matrices $w^{-1}W$ and $q^{-1}Q$. The simultaneous comparison of the estimates 
of the variances and covariances appears to be a natural extension of the comparison of 
variances in the case of a single variate. 

The appropriate test criteria for comparison may be obtained by extending the method 
of discriminant function analysis. A linear compound of the variables is taken and the 
compounding coefficients are chosen such that the ratio of mean squares due to deviation from 
hypothesis and due to error for this variable is a maximum. The ratio $F^2$ which comes out 
as a root of the determinantal equation 

$$\left|\,Q - \frac{q}{w}F^2W\,\right| = 0$$

may be used as the appropriate test criterion. If $|W| \neq 0$, the number of non-zero roots of 
this equation is equal to the number of variables under consideration or $q$, the number of 
degrees of freedom of $Q$, whichever is smaller. An adequate comparison of $w^{-1}W$ and $q^{-1}Q$ 
must involve the tests of significance of all the roots. If $F_1^2, F_2^2, \ldots$ represent the various roots, 
it is easy to verify that 

$$\frac{|W + Q|}{|W|} = \left(1 + \frac{q}{w}F_1^2\right)\left(1 + \frac{q}{w}F_2^2\right)\cdots.$$

The ratio $|W|/|W + Q|$ denoted by Λ decreases as the roots increase and a significantly 
small value of Λ may be taken as providing the significance of one or more of the roots. 
This is the underlying theory of the Λ criterion arrived at by Wilks (1932) by using the 
likelihood ratio method and later extended by Bartlett (1934)* for general use in multivariate 
analysis. 

This, however, does not provide a satisfactory test, for when only one or a smaller number 
of roots than the total indicate real differences, their significance may be obscured by the 
use of the overall test. Its use can be recommended only in situations where small deviations 
from the hypothesis can be ignored. 

* In a paper read before the Royal Statistical Society in May 1947 (Bartlett 1947), Dr Bartlett 
suggested a method of factorizing Λ, arising out of a category in the analysis of dispersion, which 
appeared different from the procedure I have outlined above. In my discussion of Bartlett’s paper, 
I pointed out the difference between his approach in some problems and the general approach by the 
method of analysis of dispersion, which alone I think can lead to unbiased tests of significance. Such 
a factorization leads to a valid test in the case $q = 1$ as I have shown elsewhere (Rao, 1946b, p. 409, 
equation 2.14). But this is not true in general. 

(b) The distribution of Λ and its practical use 

The following notation will be used throughout this and the subsequent sections. 

Analysis of dispersion for p variables 

Due to                              D.F.      s.p. matrix
(1) Deviation from hypothesis       q         Q
(2) Error                           n - q     W
(3) Total                           n         W + Q

$$\Lambda = |W|/|W + Q|$$

If the number of variables involved is $p$, then assuming that the elements of $W$ are 
distributed independently of those of $Q$ it is easy to derive the $t$th moment of Λ (Wilks, 1932; 
Bartlett, 1934) as 

$$E(\Lambda^t) = \prod_{i=0}^{p-1} \frac{\Gamma\{\tfrac{1}{2}(n-i)\}\,\Gamma\{\tfrac{1}{2}(n-q-i) + t\}}{\Gamma\{\tfrac{1}{2}(n-q-i)\}\,\Gamma\{\tfrac{1}{2}(n-i) + t\}}.$$

The tests based on the exact distributions given by Wilks (1932) and Nair (1939) for some 
particular cases are reproduced below. 


Nature of test        Variance ratio                                                   Degrees of freedom
q = 1, for any p      $\frac{1-\Lambda}{\Lambda}\cdot\frac{n-p}{p}$                    p and (n - p)
q = 2, for any p      $\frac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}}\cdot\frac{n-p-1}{p}$    2p and 2(n - p - 1)
p = 1, for any q      $\frac{1-\Lambda}{\Lambda}\cdot\frac{n-q}{q}$                    q and (n - q)
p = 2, for any q      $\frac{1-\sqrt{\Lambda}}{\sqrt{\Lambda}}\cdot\frac{n-q-1}{q}$    2q and 2(n - q - 1)

For other values of $p$ and $q$, the exact values of the probabilities can be obtained by the 
use of the $\chi^2$ tables alone as shown below. 
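The four special cases can be gathered into one helper, sketched below under the assumption that the table reads as the standard results for $q = 1$, $q = 2$, $p = 1$ and $p = 2$, with $n$ the total degrees of freedom. The name `exact_variance_ratio` is hypothetical.

```python
import math


def exact_variance_ratio(lam, n, p, q):
    """Variance ratio and degrees of freedom for the exactly solvable cases.

    lam : observed value of the criterion |W| / |W + Q|
    n   : total degrees of freedom (error has n - q)
    p   : number of variables
    q   : degrees of freedom of the deviation from hypothesis
    Returns (ratio, (df1, df2)), or None when neither p nor q is 1 or 2.
    """
    if q == 1:
        return (1 - lam) / lam * (n - p) / p, (p, n - p)
    if p == 1:
        return (1 - lam) / lam * (n - q) / q, (q, n - q)
    root = math.sqrt(lam)
    if q == 2:
        return (1 - root) / root * (n - p - 1) / p, (2 * p, 2 * (n - p - 1))
    if p == 2:
        return (1 - root) / root * (n - q - 1) / q, (2 * q, 2 * (n - q - 1))
    return None


# Single variable: W = 8 on 8 d.f. and Q = 2 on 2 d.f. give a criterion of
# 0.8, and the result is the usual ratio of mean squares (2/2) / (8/8).
ratio_p1, df_p1 = exact_variance_ratio(0.8, 10, 1, 2)

# Three variables with q = 2 and a criterion of 0.25 on n = 12.
ratio_q2, df_q2 = exact_variance_ratio(0.25, 12, 3, 2)
```

Returning `None` outside the four cases keeps the caller from mistaking an approximate result for an exact one; the series described next is meant for those remaining cases.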

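It has been shown by Wald & Brookner (1941) that the distribution of the statistic 
$v = -\log\Lambda$ is of the form 

$$C(n)\,e^{-\frac{1}{2}nv}\,v^{\frac{1}{2}pq-1}\sum_{s=0}^{\infty}\frac{\beta_sv^s}{2^s\,\Gamma(s + \tfrac{1}{2}pq)},$$

where 

$$C(n) = \prod_{i=0}^{p-1}\frac{\Gamma\{\tfrac{1}{2}(n-i)\}}{\Gamma\{\tfrac{1}{2}(n-q-i)\}}$$

and the $\beta$’s are the coefficients in the expansion of $(\tfrac{1}{2}n)^{\frac{1}{2}pq}/C(n)$ in powers of $1/n$, 

$$\frac{(\tfrac{1}{2}n)^{\frac{1}{2}pq}}{C(n)} = \beta_0 + \frac{\beta_1}{n} + \frac{\beta_2}{n^2} + \ldots.$$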

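For the purposes of examining Bartlett’s approximation and obtaining a quickly 
convergent series for the tabulation of percentage points, the transformation 

$$V = \left(n - \frac{p+q+1}{2}\right)v = mv$$

is made. Changing over to $V$ from $v$ the above distribution becomes 

$$\left(\frac{2}{m}\right)^{\frac{1}{2}pq}D(m)\,e^{-\frac{1}{2}V}\,V^{\frac{1}{2}pq-1}\sum_{s=0}^{\infty}\frac{\gamma_sV^s}{m^s\,2^{s+\frac{1}{2}pq}\,\Gamma(s + \tfrac{1}{2}pq)},$$

where 

$$D(m) = \prod_{i=0}^{p-1}\frac{\Gamma\{\tfrac{1}{2}(m + \tfrac{1}{2}(p+q+1) - i)\}}{\Gamma\{\tfrac{1}{2}(m + \tfrac{1}{2}(p-q+1) - i)\}}.$$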


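It is easy to recognize that the $\gamma$’s depend only on $p$ and $q$ and that they are the coefficients 
in the expansion of $(\tfrac{1}{2}m)^{\frac{1}{2}pq}/D(m)$ in powers of $1/m$, 

$$\frac{(\tfrac{1}{2}m)^{\frac{1}{2}pq}}{D(m)} = \gamma_0 + \frac{\gamma_1}{m} + \frac{\gamma_2}{m^2} + \ldots.$$

The asymptotic expansion of $(\tfrac{1}{2}m)^{\frac{1}{2}pq}/D(m)$ can be calculated from the formulae 

$$\left(\frac{m+2}{m}\right)^{\frac{1}{2}pq}\frac{D(m)}{D(m+2)} = \frac{\gamma_0 + \dfrac{\gamma_1}{m+2} + \ldots}{\gamma_0 + \dfrac{\gamma_1}{m} + \ldots}$$

and 

$$\frac{D(m)}{D(m+2)} = \prod_{i=0}^{p-1}\frac{m - i + (p-q+1)/2}{m - i + (p+q+1)/2}.$$

Taking the logarithms of the first equality one gets 

$$\log\left(\gamma_0 + \frac{\gamma_1}{m+2} + \ldots\right) = \frac{pq}{2}\log\left(1 + \frac{2}{m}\right) + \log\left(\gamma_0 + \frac{\gamma_1}{m} + \ldots\right)$$
$$\qquad + \sum_{i=0}^{p-1}\log\left(1 + \frac{p-q+1-2i}{2m}\right) - \sum_{i=0}^{p-1}\log\left(1 + \frac{p+q+1-2i}{2m}\right).$$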

Expanding in powers of $(1/m)$ and equating coefficients of like powers on both sides, the
first five $\gamma$’s come out as

\[ \gamma_0 = 1,^{*} \]

\[ \gamma_1 = \frac{pq}{2} + \frac{1}{16}\sum_{i=0}^{p-1}\{(p-q+1-2i)^2 - (p+q+1-2i)^2\} = 0, \]

\[ \gamma_2 = -\frac{pq}{3} + \frac{1}{96}\sum_{i=0}^{p-1}\{(p+q+1-2i)^3 - (p-q+1-2i)^3\} = -\frac{pq}{3} + \frac{pq}{48}(p^2+q^2+11) = \frac{pq}{48}(p^2+q^2-5), \]

\[ \gamma_3 = 2\gamma_2 + \frac{pq}{3} + \frac{1}{384}\sum_{i=0}^{p-1}\{(p-q+1-2i)^4 - (p+q+1-2i)^4\} = 2\gamma_2 + \frac{pq}{3} - \frac{pq}{24}(p^2+q^2+3) = 0, \]

\[ \gamma_4 = -4\gamma_2 + \frac{\gamma_2^2}{2} - \frac{2}{5}pq + \frac{1}{1280}\sum_{i=0}^{p-1}\{(p+q+1-2i)^5 - (p-q+1-2i)^5\} \]
\[ \phantom{\gamma_4} = -4\gamma_2 + \frac{\gamma_2^2}{2} - \frac{2}{5}pq + \frac{pq}{1920}\{3p^4 + 3q^4 + 110(p^2+q^2) + 10p^2q^2 + 127\} \]
\[ \phantom{\gamma_4} = \frac{\gamma_2^2}{2} + \frac{pq}{1920}\{3p^4 + 3q^4 + 10p^2q^2 - 50(p^2+q^2) + 159\}. \]

* That $\gamma_0 = 1$ is easily seen by letting $m \to \infty$ in the distribution of V, in which case it
reduces to the $\chi^2$ form. The comparison of the coefficients, in fact, gives the values of the ratios
$\gamma_1/\gamma_0$, $\gamma_2/\gamma_0$, ...
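As a check on the algebra, the sum formulae and the closed forms for the $\gamma$’s can be compared in exact rational arithmetic. The sketch below is illustrative only; the function names are assumptions introduced here, and the numerical multipliers 1/16, 1/96, 1/384 and 1/1280 are those obtained by equating coefficients in the logarithmic identity.

```python
from fractions import Fraction


def gammas_from_sums(p, q):
    """Return (gamma_1, gamma_2, gamma_3, gamma_4) from the sum formulae."""
    pq = Fraction(p * q)
    a = [p - q + 1 - 2 * i for i in range(p)]
    b = [p + q + 1 - 2 * i for i in range(p)]
    s2 = sum(x**2 - y**2 for x, y in zip(a, b))
    s3 = sum(y**3 - x**3 for x, y in zip(a, b))
    s4 = sum(x**4 - y**4 for x, y in zip(a, b))
    s5 = sum(y**5 - x**5 for x, y in zip(a, b))
    g1 = pq / 2 + Fraction(s2, 16)
    g2 = -pq / 3 + Fraction(s3, 96)
    g3 = 2 * g2 + pq / 3 + Fraction(s4, 384)
    g4 = -4 * g2 + g2**2 / 2 - Fraction(2, 5) * pq + Fraction(s5, 1280)
    return g1, g2, g3, g4


def gammas_closed_form(p, q):
    """Return (gamma_2, gamma_4) from the closed expressions in p and q."""
    pq = Fraction(p * q)
    g2 = pq * (p * p + q * q - 5) / 48
    bracket = 3 * p**4 + 3 * q**4 + 10 * p * p * q * q - 50 * (p * p + q * q) + 159
    g4 = g2**2 / 2 + pq * bracket / 1920
    return g2, g4
```

For p = 3 and q = 5 both routes give $\gamma_2 = 29 \times 15/48$, the value used in Example 3.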


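Considering only terms up to the fourth power in $(1/m)$ the distribution of V becomes

\[ \left(\frac{2}{m}\right)^{pq/2} D(m)\, e^{-V/2}\, V^{pq/2-1} \left\{ \frac{1}{2^{pq/2}\,\Gamma(\tfrac12 pq)} + \frac{\gamma_2 V^2}{m^2\, 2^{2+pq/2}\,\Gamma(2+\tfrac12 pq)} + \frac{\gamma_4 V^4}{m^4\, 2^{4+pq/2}\,\Gamma(4+\tfrac12 pq)} \right\}. \]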


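The probability of V exceeding an observed value $V_0$, then, becomes

\[ \left(\frac{2}{m}\right)^{pq/2} D(m) \left\{ P_{pq} + \frac{\gamma_2}{m^2} P_{pq+4} + \frac{\gamma_4}{m^4} P_{pq+8} \right\}, \]

where

$P_{pq+s}$ = probability of $\chi^2$ with $pq + s$ degrees of freedom exceeding the value $V_0$.

One may go a step further and expand $(2/m)^{pq/2} D(m)$ in powers of $(1/m)$, in which case the
above series becomes

\[ P_{pq} + \frac{\gamma_2}{m^2}(P_{pq+4} - P_{pq}) + \frac{1}{m^4}\{\gamma_4(P_{pq+8} - P_{pq}) - \gamma_2^2(P_{pq+4} - P_{pq})\} + \dots \]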

This form is most convenient for the calculation of the required probability. The quantities
$\gamma_2$ and $\gamma_4$ are simple functions of p and q only. Using $\chi^2$ tables, the $\gamma$’s and powers of $(1/m)$,
the probability of V exceeding $V_0$ can be calculated to a sufficient degree of accuracy.
Bartlett (1938) suggested the use of

\[ V = -m \log_e \Lambda = -\left(n - \frac{p+q+1}{2}\right) \log_e \Lambda \]

as $\chi^2$ with pq degrees of freedom. This corresponds to using the first term of the series. Since
the second term is $O(1/m^2)$, its contribution is very small even for moderately large m so
that, in many practical problems, Bartlett’s approximation can be safely used. For small
values of m one may use the second and the third terms depending on the accuracy needed
in any problem.
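The three-term calculation described above can be sketched with the standard library alone. This is an illustration under stated assumptions: the helper names `chi2_sf` and `wilks_tail_terms` are invented here, and the $\chi^2$ tail area is obtained from a plain series for the incomplete gamma function in place of the printed tables the text refers to.

```python
import math


def chi2_sf(x, df):
    """P(chi-square with df d.f. exceeds x), from the series for the
    regularized lower incomplete gamma function (fine for moderate x)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
    total, k = term, 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (a + k)
        total += term
    return max(0.0, 1.0 - total)


def wilks_tail_terms(lam, n, p, q):
    """Return (V, first, second, third) for the series
    P_pq + g2/m^2 (P_pq+4 - P_pq) + {g4 (P_pq+8 - P_pq) - g2^2 (P_pq+4 - P_pq)}/m^4."""
    m = n - (p + q + 1) / 2.0
    v = -m * math.log(lam)
    pq = p * q
    g2 = pq * (p * p + q * q - 5) / 48.0
    g4 = g2 * g2 / 2.0 + pq * (3 * p**4 + 3 * q**4 + 10 * p * p * q * q
                               - 50 * (p * p + q * q) + 159) / 1920.0
    p0, p4, p8 = chi2_sf(v, pq), chi2_sf(v, pq + 4), chi2_sf(v, pq + 8)
    first = p0
    second = g2 / m**2 * (p4 - p0)
    third = (g4 * (p8 - p0) - g2 * g2 * (p4 - p0)) / m**4
    return v, first, second, third
```

The first returned term is Bartlett’s approximation; adding the second and third terms gives the refinement suggested for small m.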


(c) Test of differences in mean values for several populations 

Let $\pi_1, \dots, \pi_k$ be k populations from which samples of sizes $N_1, \dots, N_k$ for p correlated
variables are available. The dispersion has to be analysed into ‘between’ and ‘within’
populations. The s.p. matrix ‘within’ populations (the error) has $N_1 + \dots + N_k - k$ degrees
of freedom and that ‘between’ populations has $(k - 1)$ degrees of freedom. If these are
represented by W and Q then the statistic to be used for testing the differences in mean
values is

\[ V = -m \log_e \Lambda, \]

where

\[ \Lambda = |W| / |W + Q|, \qquad m = n - \frac{p+q+1}{2}, \qquad n = N_1 + \dots + N_k - 1, \qquad q = k - 1. \]

The exact probability of V exceeding the observed value can be calculated as explained in
§ 4 (b).

Example 3. Table 4 gives the analysis of dispersion for the three characters, head length
($x_1$), height ($x_2$) and weight ($x_3$) measured on 140 schoolboys, of almost the same age, belonging
to six different schools in an Indian city.


Table 4. Analysis of dispersion

                                              s.p. matrix
Dispersion           D.F.            x_1^2     x_2^2     x_3^2    x_1 x_2   x_1 x_3   x_2 x_3
‘Between’ schools      5   (Q_ij)     752.0     151.3    1612.7     214.2     521.3     401.2
‘Within’ schools     134   (W_ij)   12809.3    1499.6   21009.6    1003.7    2671.2    4123.6
Total                139   (S_ij)   13561.3    1650.9   22622.3    1217.9    3192.5    4524.8


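\[ \Lambda = \frac{\begin{vmatrix} 12809.3 & 1003.7 & 2671.2 \\ 1003.7 & 1499.6 & 4123.6 \\ 2671.2 & 4123.6 & 21009.6 \end{vmatrix}}{\begin{vmatrix} 13561.3 & 1217.9 & 3192.5 \\ 1217.9 & 1650.9 & 4524.8 \\ 3192.5 & 4524.8 & 22622.3 \end{vmatrix}} = \frac{10^{12}(0.176005)}{10^{12}(0.213628)}, \]

\[ -\log_e \Lambda = 0.193724, \]

\[ m = 139 - \tfrac12(5 + 3 + 1) = 134.5, \]

\[ V = -m \log_e \Lambda = (134.5)(0.193724) = 26.0559. \]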
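The arithmetic of this example can be retraced in a few lines. The sketch below is only an illustration; it takes W and S from the ‘within’ and total rows of Table 4, and the helper name `det3` is an assumption introduced here.

```python
import math


def det3(a):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


# 'Within' schools (134 d.f.) and total (139 d.f.) s.p. matrices of Table 4.
W = [[12809.3, 1003.7, 2671.2],
     [1003.7, 1499.6, 4123.6],
     [2671.2, 4123.6, 21009.6]]
S = [[13561.3, 1217.9, 3192.5],
     [1217.9, 1650.9, 4524.8],
     [3192.5, 4524.8, 22622.3]]

n, p, q = 139, 3, 5              # total d.f., variables, d.f. between schools
lam = det3(W) / det3(S)          # criterion |W| / |W + Q|
m = n - (p + q + 1) / 2          # multiplier of -log(Lambda)
V = -m * math.log(lam)           # statistic referred to chi-square, pq d.f.
```

The resulting V agrees with the value quoted in the text to about three decimal places, the small difference arising from rounding in the determinants.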


Using V as $\chi^2$ with pq = 15 degrees of freedom the first approximation comes out as

\[ P_{15} = 0.0375. \]

The second term is

\[ \frac{\gamma_2}{m^2}(P_{19} - P_{15}), \]

where

\[ \gamma_2 = \frac{29 \times 15}{48} \quad\text{and}\quad \frac{\gamma_2}{m^2} = \frac{29 \times 15}{48(134.5)^2} = 0.00050096, \]

\[ \frac{\gamma_2}{m^2}(P_{19} - P_{15}) = 0.00050096(0.1286 - 0.0376) = 0.00004674. \]






This correction to the first approximation affects only the fourth decimal place so that
correction is hardly necessary. The observed value of V is significant at the 5% level, showing
thereby that boys of various schools differ in physique. This appears to be generally true
since boys belonging to different social strata attend different schools.

(d) Internal analysis of a set of variates 

Let $x_1, \dots, x_s, x_{s+1}, \dots, x_{s+p}$ be $(s+p)$ correlated variables for which samples of sizes
$N_1, \dots, N_k$ are available from k populations. If the differences in mean values of these
variables are to be tested for significance, then the method given in § 4 (c) can be used. An
important problem that arises in biometry is to test whether the variables, say $x_{s+1}, \dots, x_{s+p}$,
bring out further differences in populations when the differences due to $x_1, \dots, x_s$ are removed.

It is apparent in problems of this nature that some of the variables in the set $x_1, \dots, x_s$
might be in the nature of concomitant variates which have been observed in association with
the dependent variables or which might have been chosen to have some specified values.
An illustration of such an analysis is given in my discussion on Bartlett’s paper (Bartlett,
1947). In that problem there were three dependent variables g, h and i corresponding to
linear, parabolic and cubic terms of growth curves of pig weights and a concomitant variable
w giving the initial weight of pigs. It was desired to test whether the variables h and i bring
out further differences in food treatments when the differences due to g and w are eliminated.
The problem is identical with that posed above with g, w forming the first set and h, i the
second set of variables.

There is a third set of problems in which it is desired to test whether the differences in
k groups characterized by $(s+p)$ characters can be explained by variations in s assigned
linear functions of these variables. If $y_1, \dots, y_{s+p}$ are the $(s+p)$ variables and

\[ m_{11}y_1 + \dots + m_{1,s+p}\,y_{s+p}, \quad \dots, \quad m_{s1}y_1 + \dots + m_{s,s+p}\,y_{s+p} \]

are the assigned linear functions, then one can replace the $(s+p)$ variables $y_1, \dots, y_{s+p}$ by
$x_1, \dots, x_{s+p}$ defined by

\[ x_1 = m_{11}y_1 + \dots + m_{1,s+p}\,y_{s+p}, \]
\[ \dots \]
\[ x_s = m_{s1}y_1 + \dots + m_{s,s+p}\,y_{s+p}, \]
\[ x_{s+1} = m_{s+1,1}y_1 + \dots + m_{s+1,s+p}\,y_{s+p}, \]
\[ \dots \]
\[ x_{s+p} = m_{s+p,1}y_1 + \dots + m_{s+p,s+p}\,y_{s+p}, \]

where the coefficients in $x_{s+1}, \dots, x_{s+p}$ are chosen arbitrarily subject to the condition that
the determinant $|m_{ij}|$, $i, j = 1, 2, \dots, (s+p)$, is not zero. This latter condition ensures that
the transformation from the y’s to the x’s leads to one-to-one correspondence. Once again,
the problem is reduced to that of considering the differences in $x_{s+1}, \dots, x_{s+p}$ when those due
to $x_1, \dots, x_s$ are removed. The proposed test is independent of the compounding coefficients
used to define the set $x_{s+1}, \dots, x_{s+p}$ so that, in any practical problem, they may be
conveniently or conventionally chosen.

In all these cases, the problem is one of analysing the dispersion of the variables
$x_{s+1}, \dots, x_{s+p}$ when the dispersion due to $x_1, \dots, x_s$ is removed. This can be done by following
the covariance technique suitable for p dependent variables and s independent variables
(Wishart, 1936, 1939; Rao, 1946a).




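Let

\[ (S_{ij}) = (Q_{ij}) + (W_{ij}) \qquad (i, j = 1, 2, \dots, (s+p)) \]

be the analysis of dispersion for all the $(s+p)$ variates due to deviation from hypothesis
and error with the corresponding distribution of degrees of freedom as

\[ n' = q + (n' - q). \]

The s.p. matrix due to error for the variables $x_1, \dots, x_s$ to be eliminated is

\[ \begin{pmatrix} W_{11} & \cdots & W_{1s} \\ \vdots & & \vdots \\ W_{s1} & \cdots & W_{ss} \end{pmatrix} \]

and its inverse is represented by

\[ \begin{pmatrix} W^{11} & \cdots & W^{1s} \\ \vdots & & \vdots \\ W^{s1} & \cdots & W^{ss} \end{pmatrix}. \]

The s.p. matrix due to error for $x_{s+1}, \dots, x_{s+p}$ when corrected for $x_1, \dots, x_s$ is given by
$W(s+1, \dots, s+p \mid 1, \dots, s)$ or simply $W(p \mid s)$ where

\[ W(p \mid s) = \begin{pmatrix} W_{s+1,s+1} & \cdots & W_{s+1,s+p} \\ \vdots & & \vdots \\ W_{s+p,s+1} & \cdots & W_{s+p,s+p} \end{pmatrix} - \begin{pmatrix} W_{1,s+1} & \cdots & W_{s,s+1} \\ \vdots & & \vdots \\ W_{1,s+p} & \cdots & W_{s,s+p} \end{pmatrix} \begin{pmatrix} W^{11} & \cdots & W^{1s} \\ \vdots & & \vdots \\ W^{s1} & \cdots & W^{ss} \end{pmatrix} \begin{pmatrix} W_{1,s+1} & \cdots & W_{1,s+p} \\ \vdots & & \vdots \\ W_{s,s+1} & \cdots & W_{s,s+p} \end{pmatrix}. \]

This form which involves the evaluation of a triple product of matrices appears to be most
convenient for computation as illustrated in the next section. Replacing W by S one has
the formula for computing the s.p. matrix due to ‘deviation from hypothesis + error’ for
$x_{s+1}, \dots, x_{s+p}$ when corrected for $x_1, \dots, x_s$. If this is represented by $S(p \mid s)$ then the required
criterion is

\[ \Lambda = \frac{|W(p \mid s)|}{|S(p \mid s)|}. \]

The degrees of freedom for $W(p \mid s)$ are $(n' - q - s)$ and for $S(p \mid s)$ are $(n' - s)$ so that in
standard notation the parameters associated with $\Lambda$ are

\[ n = n' - s, \qquad p = p, \qquad q = q. \]

The test can be carried out as discussed in § 4 (b).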


(e) Barnard’s problem of secular variations in skull characters 

The problem of measuring secular variations in skull characters considered by Barnard 
(1935) is of immense importance to the anthropologists. It is, however, of interest to examine 
the methods employed by her in the light of latest developments in multivariate analysis. 
The two problems involved in her study are

(i) the selection of a smaller number, out of seven skull characters, which give significant 
information, so far as is possible, as to changes taking place with time in four series of Egyp- 
tian skulls, and 

(ii) the determination of an expression, linear in measurements, which characterizes
most effectively an individual skull with respect to the progressive secular changes.

To answer problem (i) Barnard first chose Basi-alveolar Length and Nasal Height as two 
basic characters which independently of each other, show significant variation in the four 





series. To choose further characters she considered the problem of testing the significance 
of the linear regression of the mean values of an added character with time (corre- 
sponding to the four series) when that part of the regression due to the two basic characters 
is removed. This meant the choice of characters with special reference to the average linear 
rate of change of the individual means with time. If the choice of characters is to be with 
reference to the complete nature of changes taking place with time, then what is needed is 
an internal analysis of the characters to decide whether the configuration of the four series as 
determined by several characters is the same as that indicated by a smaller number. Bar- 
nard’s method should, of course, be preferred if the regressions were known to be linear. 
This can, however, be tested from the data. 

Taking the four measurements 

$x_1$ Basi-alveolar Length,
$x_2$ Nasal Height,
$x_3$ Maximum Breadth,
$x_4$ Basi-bregmatic Height,

the relevant data are summarized in Tables 5 and 6 which give the means for the four series 
and the analysis of dispersion. 


Table 5. Means for the four series

                                    Series
Character    I              II             III            IV
             N_1 = 91       N_2 = 162      N_3 = 70       N_4 = 75
x_1          133.582418     134.265432     134.371429     135.306667
x_2           98.307692      96.462963      95.857143      95.040000
x_3           50.835165      51.148148      50.100000      52.093333
x_4          133.000000     134.882716     133.642857     131.466667


Table 6. Analysis of dispersion

Dispersion    ‘Between’ (3 d.f.)    ‘Within’ (394 d.f.)    Total (397 d.f.)
x_1^2             123.180628           9661.997470           9785.178098
x_2^2             486.345863           9073.115027           9559.460890
x_3^2             150.411505           3938.320351           4088.731856
x_4^2             640.733891           8741.508829           9382.242720
x_1 x_2          -231.375635            445.573301            214.197666
x_1 x_3            87.305348           1130.623900           1217.929248
x_1 x_4          -128.763994           2148.584210           2019.820216
x_2 x_3          -107.505618           1239.221990           1131.716372
x_2 x_4           125.313318           2255.812722           2381.126040
x_3 x_4          -137.580764           1271.054662           1133.473898






Example 4. Do the characters $x_3$ and $x_4$ show significant variation in the four series
independently of the variation due to the characters $x_1$ and $x_2$?

The method developed in § 4 (d) is directly useful in this problem. The s.p. matrix ‘within’
for the basic characters $x_1$ and $x_2$ is


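\[ \begin{pmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{pmatrix} = \begin{pmatrix} 9661.997470 & 445.573301 \\ 445.573301 & 9073.115027 \end{pmatrix}. \]

Its inverse is

\[ \begin{pmatrix} W^{11} & W^{12} \\ W^{21} & W^{22} \end{pmatrix} = 10^{-4} \begin{pmatrix} 1.037332 & -0.050942 \\ -0.050942 & 1.104659 \end{pmatrix}. \]

The ‘within’ s.p. matrix for $x_3$, $x_4$ due to $x_1$, $x_2$ is given by the triple product

\[ \begin{pmatrix} W_{13} & W_{23} \\ W_{14} & W_{24} \end{pmatrix} \begin{pmatrix} W^{11} & W^{12} \\ W^{21} & W^{22} \end{pmatrix} \begin{pmatrix} W_{13} & W_{14} \\ W_{23} & W_{24} \end{pmatrix} \]

\[ = \begin{pmatrix} 1130.623900 & 1239.221990 \\ 2148.584210 & 2255.812722 \end{pmatrix} \begin{pmatrix} W^{11} & W^{12} \\ W^{21} & W^{22} \end{pmatrix} \begin{pmatrix} W_{13} & W_{14} \\ W_{23} & W_{24} \end{pmatrix} \]

\[ = 10^{-4} \begin{pmatrix} 1109.703904 & 1311.321492 \\ 2113.879535 & 2382.450025 \end{pmatrix} \begin{pmatrix} W_{13} & W_{14} \\ W_{23} & W_{24} \end{pmatrix} = \begin{pmatrix} 287.967620 & 534.238796 \\ 534.238796 & 991.621041 \end{pmatrix}. \]

The ‘within’ s.p. matrix for $x_3$ and $x_4$ after correcting for $x_1$ and $x_2$ is

\[ \begin{pmatrix} W_{33} & W_{34} \\ W_{34} & W_{44} \end{pmatrix} - \begin{pmatrix} W_{13} & W_{23} \\ W_{14} & W_{24} \end{pmatrix} \begin{pmatrix} W^{11} & W^{12} \\ W^{21} & W^{22} \end{pmatrix} \begin{pmatrix} W_{13} & W_{14} \\ W_{23} & W_{24} \end{pmatrix} \]

\[ = \begin{pmatrix} 3938.320351 & 1271.054662 \\ 1271.054662 & 8741.508829 \end{pmatrix} - \begin{pmatrix} 287.967620 & 534.238796 \\ 534.238796 & 991.621041 \end{pmatrix} = \begin{pmatrix} 3650.352731 & 736.815866 \\ 736.815866 & 7749.887788 \end{pmatrix} = W(2 \mid 2). \]

This has 394 - 2 = 392 degrees of freedom. Similarly $S(2 \mid 2)$ with 397 - 2 = 395 degrees
of freedom is

\[ \begin{pmatrix} 3809.335190 & 611.798381 \\ 611.798381 & 8393.755848 \end{pmatrix}, \]

\[ \Lambda = \frac{|W(2 \mid 2)|}{|S(2 \mid 2)|} = \frac{0.27746934 \times 10^{8}}{0.31600332 \times 10^{8}} = 0.878058, \]

\[ V = -m \log_e \Lambda, \]

\[ m = n - \frac{p+q+1}{2} = 395 - \frac{2+3+1}{2} = 392, \]

\[ V = -392 \log_e (0.878058) = 51.39. \]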


This value of V on pq = 6 degrees of freedom is highly significant so that $x_3$ and $x_4$ may be
considered as discriminating the series independently of $x_1$ and $x_2$.
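The elimination of $x_1$ and $x_2$ in this example can be sketched as below. This is an illustration only: the helper names `solve`, `partial_sp` and `det2` are assumptions made here, the matrices are assembled from the ‘within’ and total columns of Table 6, and a small Gauss-Jordan routine stands in for the hand inversion used in the text.

```python
def solve(a, b):
    """Solve a x = b (a square, b a matrix) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(map(float, a[i])) + list(map(float, b[i])) for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def partial_sp(full, s):
    """s.p. matrix of the last variables corrected for the first s of them."""
    k = len(full)
    a11 = [row[:s] for row in full[:s]]
    a12 = [row[s:] for row in full[:s]]
    x = solve(a11, a12)                      # a11^{-1} a12
    return [[full[i][j] - sum(full[i][r] * x[r][j - s] for r in range(s))
             for j in range(s, k)] for i in range(s, k)]


def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


W = [[9661.997470, 445.573301, 1130.623900, 2148.584210],
     [445.573301, 9073.115027, 1239.221990, 2255.812722],
     [1130.623900, 1239.221990, 3938.320351, 1271.054662],
     [2148.584210, 2255.812722, 1271.054662, 8741.508829]]
S = [[9785.178098, 214.197666, 1217.929248, 2019.820216],
     [214.197666, 9559.460890, 1131.716372, 2381.126040],
     [1217.929248, 1131.716372, 4088.731856, 1133.473898],
     [2019.820216, 2381.126040, 1133.473898, 9382.242720]]

W22 = partial_sp(W, 2)                       # W(2 | 2), 392 d.f.
S22 = partial_sp(S, 2)                       # S(2 | 2), 395 d.f.
lam = det2(W22) / det2(S22)
```

The same `partial_sp` call applies to any number s of eliminated variables, since only the leading s x s block is inverted.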

Example 5. Taking the relative times between the series in the proportion 2:1:2, can the 
variation of the characters be accounted for by the linear regression of individual characters 
with time? 

In order to obtain the regression with time the values of t, the time variable, may be taken 
as —5, —1, +1 and +5 for the individuals of the first, second, third and fourth series 
respectively. The calculation of individual regressions involves the quantities 

\[ \Sigma (t - \bar t)^2 = 4307.66832, \]

\[ \Sigma x_1 (t - \bar t) = 718.76280, \qquad \Sigma x_2 (t - \bar t) = -1407.26075, \]

\[ \Sigma x_3 (t - \bar t) = 410.10194, \qquad \Sigma x_4 (t - \bar t) = -733.42758. \]




The matrix R with 1 degree of freedom giving the squares and products due to regression 
is given in Table 7. 

Table 7. Matrix R with 1 degree of freedom

          x_1            x_2            x_3            x_4
x_1    119.930358    -234.810812      68.428235    -122.377258
x_2   -234.810812     459.734449    -133.975163     149.601596
x_3     68.428235    -133.975163      39.042852     -69.824358
x_4   -122.377258     149.601596     -69.824358     124.874099


In the above table 

\[ R_{11} = [\Sigma x_1 (t - \bar t)]^2 / \Sigma (t - \bar t)^2, \qquad R_{12} = [\Sigma x_1 (t - \bar t)][\Sigma x_2 (t - \bar t)] / \Sigma (t - \bar t)^2, \]

and so on. With these results one may analyse the dispersion of which a typical product
($x_1 x_2$) is chosen below for illustration.


Table 8. Analysis of dispersion

Due to                                   D.F.     s.p. matrix
Regression                                  1     -234.810812   (R_ij)
Deviation from regression*                  2        3.435177   (Q_ij)
Total (‘between’ series)                    3     -231.375635   (R_ij + Q_ij)
‘Within’ series                           394      445.573301   (W_ij)
Total                                     397      214.197666   (S_ij)
Deviation from regression + ‘within’      396      449.008478   (Q_ij + W_ij)

* This quantity is obtained by subtraction. The complete matrix $(Q_{ij} + W_{ij})$ obtained by the above
method is given in Table 9.
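The subtraction scheme of Table 8 for the typical product $x_1 x_2$ can be retraced as follows. This is a sketch only: the function name `decompose_product` is an assumption, and the regression sums are entered with the digits as reconstructed from the entries of Table 7, so the last decimals may differ slightly from the printed values.

```python
def decompose_product(sum_i, sum_j, sum_tt, between, within):
    """Split one s.p. element into regression, deviation and error parts.

    sum_i, sum_j : sums of x_i (t - tbar) and x_j (t - tbar)
    sum_tt       : sum of (t - tbar)^2
    between      : 'between' series element (3 d.f.)
    within       : 'within' series element (394 d.f.)
    """
    regression = sum_i * sum_j / sum_tt          # R_ij, 1 d.f.
    deviation = between - regression             # Q_ij, 2 d.f., by subtraction
    return {
        "R": regression,
        "Q": deviation,
        "S": between + within,                   # total, 397 d.f.
        "Q+W": deviation + within,               # 396 d.f.
    }


# Product x_1 x_2 (sums assumed as reconstructed; Table 6 gives the rest).
parts = decompose_product(718.76280, -1407.26075, 4307.66832,
                          between=-231.375635, within=445.573301)
```

Repeating the call for every pair of characters fills in the complete matrix of deviations from regression plus error.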


Table 9. Matrix $(Q_{ij} + W_{ij})$ with 396 degrees of freedom

          x_1            x_2            x_3            x_4
x_1   9665.247740     449.008478    1149.501013    2142.197474
x_2    449.008478    9099.726441    1265.691535    2231.524444
x_3   1149.501013    1265.691535    4049.689004    1203.298256
x_4   2142.197474    2231.524444    1203.298256    9257.368621


To test the hypothesis that the regressions are linear one has to compare W and Q + W.

\[ \Lambda = \frac{|W|}{|Q + W|} = \frac{0.24269054 \times 10^{16}}{0.26873816 \times 10^{16}} = 0.90307436, \]

\[ V = -\left\{396 - \frac{2+4+1}{2}\right\} \log_e (0.90307436). \]










The $\chi^2$ approximation has $pq = 2 \times 4 = 8$ degrees of freedom, since Q has 2 degrees of
freedom and there are four variables. The result is significant so that the regressions cannot
be considered linear.
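The value of V for this linearity test follows directly from the ratio of determinants. The short sketch below is illustrative; the function name `bartlett_v` is an assumption, and the 1% point of $\chi^2$ for 8 degrees of freedom is the standard tabulated value.

```python
import math


def bartlett_v(lam, n, p, q):
    """V = -{n - (p + q + 1)/2} log(Lambda), referred to chi-square, pq d.f."""
    m = n - (p + q + 1) / 2.0
    return -m * math.log(lam)


lam = 0.24269054 / 0.26873816        # |W| / |Q + W| from the text
V = bartlett_v(lam, n=396, p=4, q=2)
CHI2_8_UPPER_1_PERCENT = 20.090      # tabulated upper 1% point, 8 d.f.
significant = V > CHI2_8_UPPER_1_PERCENT
```

Here V comes out near 40, well beyond the 1% point, in line with the conclusion drawn in the text.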

This test can be extended to examine whether a parabolic regression with time can explain 
the differences in mean values. The matrix Q giving the deviation from regression has then 
1 degree of freedom and R due to regression 2.

To determine the coefficients of a linear compound which characterizes most effectively 
the secular changes in progress, Barnard maximized the ratio of the square of unweighted 
regression of the compound with time. It is doubtful, as Bartlett (1947) points out, whether 
such a linear compound can be used to specify an individual skull most effectively with 
respect to progressive changes, since linear regression with time does not adequately explain 
all the differences in the four series. 

The variance ratios obtained in this paper in the case of two samples can also be 
derived from a general regression analysis by considering certain pseudo-variates, which 
have constant values for members of the same sample, as dependent variables and the 
observed values as independent variables (Bartlett, 1939; Fisher, 1940; Brown, 1947). 
Such an approach does not seem to be possible in the case of a single sample. On the 
other hand, the statistics defined in (2.1) and (2.2) can be used in any situation where
there are a number of linear hypotheses to be tested with the use of the estimated
deviations and their independently estimated dispersion matrix. The distribution of the
statistic in (2.2) has been derived by the author (Rao, 1946b) under these general conditions.

I wish to thank Dr Wishart for his helpful criticism during the preparation of this paper. 


REFERENCES 

Barnard, M. M. (1935). The secular variation of skull characters in four series of Egyptian skulls.
Ann. Eugen., Lond., 7, 89.

Bartlett, M. S. (1934). The vector representation of a sample. Proc. Camb. Phil. Soc. 30, 327.

Bartlett, M. S. (1938). Further aspects of the theory of multiple regression. Proc. Camb. Phil. Soc.
34, 33.

Bartlett, M. S. (1939). The standard errors of discriminant function coefficients. J. Roy. Statist.
Soc. Suppl. 6, 169.

Bartlett, M. S. (1947). Multivariate analysis. J. Roy. Statist. Soc. Suppl. 9, 76.

Bose, R. C. & Roy, S. N. (1938). The distribution of the studentized D²-statistic. Sankhyā, 4, 19.

Brown, G. W. (1947). Discriminant functions. Ann. Math. Statist. 18, 514.

Fairfield Smith (1936). A discriminant function for plant selection. Ann. Eugen., Lond., 7, 240.

Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Ann. Eugen., Lond.,
7, 179.

Fisher, R. A. (1938). The statistical utilization of multiple measurements. Ann. Eugen., Lond., 8, 376.

Fisher, R. A. (1939). The sampling distribution of some statistics obtained from nonlinear regression.
Ann. Eugen., Lond., 9, 238.

Fisher, R. A. (1940). The precision of the discriminant function. Ann. Eugen., Lond., 10, 422.

Hotelling, H. (1931). The generalization of Student's ratio. Ann. Math. Statist. 2, 360.

Hotelling, H. (1936). The relation between two sets of variates. Biometrika, 28, 321.

Hotelling, H. (1947). A generalized T measure of multivariate dispersion. Ann. Math. Statist.
18, 298 (Abstract 2).

Hsu, P. (1939a). On the generalized analysis of variance. Biometrika, 31, 221.

Hsu, P. (1939b). On the distribution of the roots of certain determinantal equations. Ann. Eugen.,
Lond., 9, 250.

Mahalanobis, P. C. (1936). On the generalized distance in statistics. Proc. Nat. Inst. Sci. (India),
12, 49.




Martin, E. A. (1935). A study of the Egyptian series of mandibles, with special reference to
mathematical methods of sexing. Biometrika, 28, 149.

Nair, U. S. (1939). The application of moment functions in the study of distribution laws in statistics.
Biometrika, 30, 274.

Neyman, J. & Pearson, E. S. (1928). On the use and interpretation of certain test criteria for purposes
of statistical inference. Biometrika, 20 A, 175.

Neyman, J. & Pearson, E. S. (1931). On the problem of k samples. Bull. int. Acad. Cracovie, A, p. 460.

Pearson, E. S. & Neyman, J. (1930). On the problem of two samples. Bull. int. Acad. Cracovie,
A, p. 73.

Plackett, R. L. (1947). An exact test for the equality of variances. Biometrika, 34, 311.

Rao, C. R. (1945). Generalization of Markoff’s theorem and tests of linear hypothesis. Sankhyā, 7, 9.

Rao, C. R. (1946a). On the linear combination of observations and the general theory of least squares.
Sankhyā, 7, 237.

Rao, C. R. (1946b). Tests with discriminant functions in multivariate analysis. Sankhyā, 7, 407.

Rao, C. R. (1947). The problem of classification and distance between two populations. Letter to
Nature, 159, 30.

Roy, S. N. (1939). p-statistics or some generalizations in analysis of variance appropriate to
multivariate problems. Sankhyā, 4, 381.

Wald, A. & Brookner, R. J. (1941). On the distribution of Wilks’ statistic for testing independence
of several groups of variables. Ann. Math. Statist. 12, 137.

Welch, B. L. (1939). Note on discriminant functions. Biometrika, 31, 218.

Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24, 471.

Wilks, S. S. (1935). On the independence of k sets of normally distributed statistical variables.
Econometrica, 3, 309.

Wishart, J. (1928). The generalized product moment distribution in samples from a normal
multivariate population. Biometrika, 20 A, 32.

Wishart, J. (1936). Tests of significance in analysis of covariance. J. Roy. Statist. Soc. Suppl. 3, 79.

Wishart, J. (1939). Statistical treatment of animal experiments. J. Roy. Statist. Soc. Suppl. 6, 1.





ALTERNATIVE SYSTEMS IN THE ANALYSIS OF VARIANCE 

By N. L. JOHNSON 

1. It is the purpose of this paper to compare the fundamental theoretical set-ups implied 
by certain well-known systems of approach to the analysis of variance. Differences in 
interpretation in the different systems are discussed, and attention is drawn to some par- 
ticularly simple results in the theory associated with one of the systems. No attempt is 
made to place the systems in any order of general preference. It is the author’s opinion that 
each has its own sphere of application, while consideration of problems from the viewpoints 
of more than one of the systems will often prove enlightening. 

2. The power functions of tests used in the analysis of variance have been considered by 
Hsu (1941), and by Tang (1938) who has given tables by means of which the power may be 
evaluated numerically in certain cases. 

In the particular case of testing for differences between k groups, the theoretical set-up 
used by these authors is 

\[ x_{ij} = B_i + z_{ij} \qquad (i = 1, \dots, k;\ j = 1, \dots, n_i), \qquad (1) \]

where $x_{ij}$ corresponds to the jth observation in the ith group. $B_i$ is a constant representing
the expected value in the ith group and the $z_{ij}$’s are independent random variables, each
with zero expected value and standard deviation $\sigma$.


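The appropriate analysis of variance is:

Source            Sum of squares                                                   Degrees of freedom    Mean square
Between groups    $\sum_{i=1}^{k} n_i(\bar{x}_{i.} - \bar{x}_{..})^2$                     $k - 1$               $\frac{1}{k-1}\sum_{i=1}^{k} n_i(\bar{x}_{i.} - \bar{x}_{..})^2$
Within groups     $\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.})^2$              $N - k$               $\frac{1}{N-k}\sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{i.})^2$

where $N = \sum_{i=1}^{k} n_i$ is the total number of observations; $\bar{x}_{i.} = n_i^{-1}\sum_{j=1}^{n_i} x_{ij}$ is the mean observed
value in the ith group; $\bar{x}_{..} = N^{-1}\sum_{i=1}^{k}\sum_{j=1}^{n_i} x_{ij}$ is the mean of all the observed values. The expected
value of the between groups mean square is $\sigma^2 + \frac{1}{k-1}\sum_{i=1}^{k} n_i(B_i - \bar{B})^2$. The expected value of
the within groups mean square is $\sigma^2$. $\left(\bar{B} = N^{-1}\sum_{i=1}^{k} n_i B_i.\right)$ The ratio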

(between groups mean square)/(within groups mean square) 

is used to test whether there is any difference between the $B_i$’s. If there is no such difference
the expected values of the numerator and denominator of the above ratio are each equal
to $\sigma^2$; otherwise the expected value of the numerator is greater than $\sigma^2$.
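The analysis of variance just described can be sketched in a few lines. This is an illustration only; the function name `one_way_anova` and the dictionary keys are assumptions introduced here.

```python
def one_way_anova(groups):
    """Between and within groups mean squares for a list of samples."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    ss_within = sum((x - mu) ** 2
                    for g, mu in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (N - k)
    return {
        "ss_between": ss_between, "df_between": k - 1, "ms_between": ms_between,
        "ss_within": ss_within, "df_within": N - k, "ms_within": ms_within,
        "ratio": ms_between / ms_within,
    }
```

For example, `one_way_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])` gives a between groups mean square of 21 on 2 degrees of freedom and a within groups mean square of 1 on 6 degrees of freedom.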

If it be assumed that each of the $z_{ij}$’s is normally distributed, the ratio of the mean squares
will be referred to the F distribution with degrees of freedom $\nu_1 = k - 1$, $\nu_2 = N - k$. Large
observed values of the ratio are regarded as significant. A suitable upper significance limit
of the appropriate F distribution may be used as a formal critical limit for the mean square
ratio. Tang showed that, for such a test with alternative hypotheses specified by (1) above,
the power (i.e. the chance of establishing significance when the $B_i$’s are not all equal) depends
on k, N and, say, $\theta = \sum_{i=1}^{k} n_i(B_i - \bar{B})^2/\sigma^2$ only. The power function is somewhat
complicated in its mathematical expression, but Tang’s tables make it possible to determine
the chance that a given ratio $\theta$ would be established as significant at either the 5 or 1% level.

3. The alternative approach described below involves a modification in the theoretical 
set-up and leads to a very simple form of power function in a particular, but common, case. 
Although the modification may not always be justifiable, it will often provide a more
accurate model than set-up (1), apart from the greater simplicity in the resulting analysis. 

This alternative form of theoretical set-up is constructed by replacing (1) by 

\[ x_{ij} = A + z'_i + z_{ij}, \qquad (2) \]

where A is a constant and the $z'_i$’s are independent random variables, each with expected
value zero and standard deviation $\sigma'$. The $z'_i$’s and the $z_{ij}$’s are also mutually independent.
In (2) the constants $B_i$ of (1) are replaced by the random variables $A + z'_i$, and the hypothesis
$B_1 = B_2 = \dots = B_k$ is replaced by the hypothesis $\sigma' = 0$.

It is evident that (2) is a suitable set-up if it is possible to regard the groups as being chosen
at random from a large assemblage of groups. If the groups are fixed, (1) will be preferable.
Sub-sampling by batches from a randomly selected sample of batches is a typical example
where (2) is suitable; comparison of a number of distinct treatments of a material is a typical
example where (1) is preferable. Set-ups (1) and (2) represent extreme cases. In general it
is likely that an intermediate set-up of form

\[ x_{ij} = B_i + z'_i + z_{ij} \]

would be most appropriate. Daniels (1939) and Eisenhart (1947) have discussed, in some 
detail, the factors affecting the relevance of set-ups (1) and (2) in any particular problem. 

However, the same analysis of variance is suitable in both extreme cases and so is likely
to be suitable for any intermediate case. In fact, under the assumptions summarized in (2):

The expected value of the between groups mean square is $\sigma^2 + \sigma'^2\left(N^2 - \sum_{i=1}^{k} n_i^2\right)\big/\{N(k-1)\}$.

The expected value of the within groups mean square is $\sigma^2$. The ratio

(between groups mean square)/(within groups mean square)

is again suitable for testing the hypothesis $\sigma' = 0$. If it be assumed that the $z_{ij}$’s are normally
distributed, the F significance limits may be used as described in § 2.
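The multiplier of $\sigma'^2$ in the expected between groups mean square can be computed as in the sketch below, which is illustrative only; the function name `sigma_prime_multiplier` is an assumption made here.

```python
def sigma_prime_multiplier(sizes):
    """Return (N^2 - sum of n_i^2) / {N (k - 1)} for group sizes n_1, ..., n_k."""
    N = sum(sizes)
    k = len(sizes)
    return (N * N - sum(n * n for n in sizes)) / (N * (k - 1))
```

With equal group sizes the multiplier reduces to the common size n, so that the expected between groups mean square becomes $\sigma^2 + n\sigma'^2$, the form used in the sections that follow.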

4. If it now be assumed that the $z'_i$’s, as well as the $z_{ij}$’s, are normally distributed, it is
possible to obtain the power function in the case $n_1 = n_2 = \dots = n_k = n$, say, in a particularly
simple form. From (2) it follows that

\[ \bar{x}_{i.} = A + z'_i + \bar{z}_{i.} \quad\text{where}\quad \bar{z}_{i.} = n^{-1}\sum_{j=1}^{n} z_{ij} \]
\[ \phantom{\bar{x}_{i.}} = A + u_i \quad\text{where}\quad u_i = z'_i + \bar{z}_{i.}. \]

The $u_i$’s are independent normal variables each with expected value zero and standard
deviation $\sqrt{(n^{-1}\sigma^2 + \sigma'^2)}$. The between groups sum of squares is

\[ n\sum_{i=1}^{k} (\bar{x}_{i.} - \bar{x}_{..})^2 = n\sum_{i=1}^{k} (u_i - \bar{u})^2 \quad\text{where}\quad \bar{u} = k^{-1}\sum_{i=1}^{k} u_i, \]

and hence is distributed as $\chi^2(\sigma^2 + n\sigma'^2)$ with degrees of freedom $\nu_1 = k - 1$. The within groups
sum of squares is

\[ \sum_{i=1}^{k}\sum_{j=1}^{n} (x_{ij} - \bar{x}_{i.})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n} (z_{ij} - \bar{z}_{i.})^2 \]

and is distributed independently as $\chi^2\sigma^2$ with $\nu_2 = k(n - 1)$ degrees of freedom. Hence the
mean square ratio used in the analysis of variance test is distributed as

\[ F\,(1 + n\sigma'^2/\sigma^2) \]

with $\nu_1 = k - 1$, $\nu_2 = k(n - 1)$.

If $F_\alpha$, the upper $100\alpha$% point of the F distribution, be used as a formal critical limit, the
probability of rejection of the hypothesis $\sigma' = 0$ is

\[ \Pr\{F(1 + n\sigma'^2/\sigma^2) > F_\alpha\} = \Pr\{F > F_\alpha (1 + n\sigma'^2/\sigma^2)^{-1}\}. \qquad (3) \]

This is the power of the test with respect to the alternative hypothesis specified by (2)
and a particular (non-zero) value of $\sigma'$. When $\sigma' = 0$ the probability of rejection is $\alpha$. As $\sigma'$
increases $F_\alpha(1 + n\sigma'^2/\sigma^2)^{-1}$ decreases and the power increases. The expression (3), considered
as a function of $\sigma'$, is the power function of the analysis of variance test in this case.

It will be noted that the power function is a function of $n\sigma'^2/\sigma^2$. This is analogous to
the fact that with set-up (1) the power function is a function of $\{n/(k-1)\}\sum_{i=1}^{k} (B_i - \bar{B})^2/\sigma^2$.
$\sigma'^2$ and $\sum_{i=1}^{k} (B_i - \bar{B})^2/(k - 1)$, indeed, fulfil similar roles in the two systems.

5. The calculation of the power function from (3) is straightforward. Since

\[ p(F) = \frac{(k-1)^{\frac12(k-1)}\,[k(n-1)]^{\frac12 k(n-1)}}{B(\tfrac12[k-1], \tfrac12 k[n-1])} \cdot \frac{F^{\frac12(k-3)}}{[k(n-1) + (k-1)F]^{\frac12(nk-1)}}, \]

it follows that $\Pr\{F > F_\alpha(1 + n\sigma'^2/\sigma^2)^{-1}\}$ can be expressed as the incomplete beta function
ratio

\[ I_\phi(\tfrac12 k[n-1], \tfrac12[k-1]), \]

where $\phi = k(n-1)/\{k(n-1) + (k-1)F_\alpha(1 + n\sigma'^2/\sigma^2)^{-1}\}$.

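When the number of groups is odd and not large it is possible to evaluate (3) as a simple
explicit function $\beta(\sigma'/\sigma)$ of $\sigma'/\sigma$. For example if k = 3,

\[ p(F) = K_n\,[\tfrac32(n-1) + F]^{-\frac12(3n-1)}, \]

where

\[ K_n = [\tfrac32(n-1)]^{\frac12(3n-1)}. \]

Hence

\[ \beta(\sigma'/\sigma) = \int_{F_\alpha(1+n\sigma'^2/\sigma^2)^{-1}}^{\infty} p(F)\,dF = K'_n\,[\tfrac32(n-1) + F_\alpha(1 + n\sigma'^2/\sigma^2)^{-1}]^{-\frac32(n-1)}, \qquad (4) \]

where

\[ K'_n = K_n/\{\tfrac32(n-1)\} = [\tfrac32(n-1)]^{\frac12(3n-3)}. \]

Since

\[ \int_{F_\alpha}^{\infty} p(F)\,dF = \alpha, \]

\[ K'_n\,[\tfrac32(n-1) + F_\alpha]^{-\frac32(n-1)} = \alpha, \]

and so

\[ \beta(\sigma'/\sigma) = \alpha \left\{ \frac{\tfrac32(n-1) + F_\alpha}{\tfrac32(n-1) + F_\alpha(1 + n\sigma'^2/\sigma^2)^{-1}} \right\}^{\frac32(n-1)}. \qquad (4)\ \text{bis} \]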
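For k = 3 the explicit form lends itself to direct computation, since the relation $K'_n[\tfrac32(n-1) + F_\alpha]^{-\frac32(n-1)} = \alpha$ can be solved for $F_\alpha$ in closed form. The sketch below is an illustration under that observation; the function names are assumptions made here.

```python
def f_upper_point_k3(alpha, n):
    """Upper 100*alpha % point of F with 2 and 3(n - 1) degrees of freedom,
    solved from (1 + F/c)^(-c) = alpha with c = 3(n - 1)/2."""
    c = 1.5 * (n - 1)
    return c * (alpha ** (-1.0 / c) - 1.0)


def power_k3(ratio, n, alpha):
    """Power for k = 3 groups of n observations each.

    ratio is sigma'/sigma; the critical limit is the upper 100*alpha % point.
    """
    c = 1.5 * (n - 1)
    f_alpha = f_upper_point_k3(alpha, n)
    shrunk = f_alpha / (1.0 + n * ratio ** 2)
    return alpha * ((c + f_alpha) / (c + shrunk)) ** c
```

With n = 5 and $\alpha$ = 0.05 the critical limit is about 3.89, the usual tabulated 5% point for 2 and 12 degrees of freedom, and the power rises from 0.05 at $\sigma'/\sigma = 0$ towards unity as the ratio increases.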


6. In many cases the hypothesis $B_1 = B_2 = \dots = B_k$, or the corresponding hypothesis
in set-up (2), $\sigma' = 0$, is unduly stringent. That is to say, a test procedure is required which
will lead to acceptance in a high proportion of cases provided there is not too much difference
between the groups. It is natural to use as measure of the amount of difference between the
groups a parameter on which the power function depends. The parameter to be used would
therefore be $\sum_{i=1}^{k} (B_i - \bar{B})^2/(k - 1)$ in conjunction with set-up (1), or $\sigma'$ with set-up (2). The
critical limit for the mean square ratio should then be chosen so that the chance of rejection
is less than $\alpha_1$, say, for any value of the parameter less than some specified amount.

It will also be desirable that rejection shall take place in a high proportion of cases, at
least $1 - \alpha_2$, say, when the groups differ by more than a certain amount. That is, the test
should have at least a certain minimum sensitivity.

The requirements described above may be summed up formally as follows:

(a) The power function of the test must be less than $\alpha_1$ for all values of the difference
parameter less than some known amount.

(b) The power function must be greater than $1 - \alpha_2$ for all values of the difference
parameter greater than a second known amount.

This type of problem occurs in the theory of statistical quality control, the limiting values
of the parameter corresponding to ‘Producer’s Tolerance’ or ‘Process Average’ and
‘Consumer’s Tolerance’ or ‘Lot Tolerance’ respectively.

Evidently, conditions (a) and (b) cannot both be satisfied unless the data available are
sufficiently numerous. The amount of data required can be found from a study of the appro- 
priate power functions. Tang’s tables are useful in this connexion when set-up (1) is suitable. 
The analysis is much simpler for set-up (2) in the case we have considered in § 3 et seq. In the 
next section certain interesting results will be derived for this particular situation. 

7. It will be convenient to define the difference between the groups by the parameter $\sigma'/\sigma$, instead of simply $\sigma'$. It may be noted that $\sigma'/\sigma$ is the ratio of a parameter representing between-groups variability to one representing within-groups variability, and so can be regarded as a measure of relative between-groups variability. We shall require the analysis of variance test to be such that

(a) the probability of rejection is less than $\alpha_1$ if $\sigma'/\sigma < \lambda_1$;

(b) the probability of rejection is greater than $1-\alpha_2$ if $\sigma'/\sigma > \lambda_2$.

In the limiting case when (a) is just satisfied, it is clear that the critical limit $F_0$ must be such that

$$\Pr\{F(1+n\lambda_1^2) > F_0\} = \alpha_1.$$

Hence $F_0 = F_{\alpha_1}(1+n\lambda_1^2)$, $F_{\alpha_1}$ being the upper $100\alpha_1$ % point of the $F$ distribution with $\nu_1 = k-1$, $\nu_2 = k(n-1)$.

If condition (b) is also to be satisfied we must have

$$\Pr\{F(1+n\lambda_2^2) > F_0\} > 1-\alpha_2,$$

i.e.

$$\Pr\{F(1+n\lambda_2^2) > F_{\alpha_1}(1+n\lambda_1^2)\} > 1-\alpha_2.$$

This means that

$$\frac{F_{\alpha_1}}{F_{1-\alpha_2}} < \frac{1+n\lambda_2^2}{1+n\lambda_1^2}, \qquad (5)$$

$F_{1-\alpha_2}$ being the lower $100\alpha_2$ % point of the $F$ distribution with $\nu_1 = k-1$, $\nu_2 = k(n-1)$. The choice of $\lambda_1$, $\lambda_2$, $\alpha_1$ and $\alpha_2$ will depend on practical requirements. Once these are decided upon, the number of groups, $k$, and the number of samples per group, $n$, should be chosen so that (5) is satisfied, if possible.

Imagine $k$ fixed, and consider the effect of increasing $n$. As $n$ increases the ratio $F_{\alpha_1}/F_{1-\alpha_2}$ decreases steadily, approaching the limit $\chi^2_{\alpha_1}/\chi^2_{1-\alpha_2}$ as $n$ approaches infinity. ($\chi^2_{\alpha_1}$ and $\chi^2_{1-\alpha_2}$ represent the upper $100\alpha_1$ % and lower $100\alpha_2$ % points respectively of the $\chi^2$ distribution with $(k-1)$ degrees of freedom.) As $n$ increases the ratio $(1+n\lambda_2^2)/(1+n\lambda_1^2)$ increases steadily from 1 to $(\lambda_2/\lambda_1)^2$. If $\chi^2_{\alpha_1}/\chi^2_{1-\alpha_2}$ is less than $(\lambda_2/\lambda_1)^2$ there will be a number $n_0$ such that if $n > n_0$ condition (5) is satisfied. If $\chi^2_{\alpha_1}/\chi^2_{1-\alpha_2}$ is greater than $(\lambda_2/\lambda_1)^2$, on the other hand, it will not be possible to satisfy (5), however large $n$ may be. $\chi^2_{\alpha_1}/\chi^2_{1-\alpha_2}$ is a decreasing function of $k$, approaching 1 as $k$ tends to infinity. There will therefore be a minimum number of groups, $k_0$, below which it is impossible to satisfy (5). A short table of such minimum values is given below.

      $\alpha_1 = \alpha_2 = 0.05$          $\alpha_1 = \alpha_2 = 0.01$

      $\lambda_2/\lambda_1$     $k_0$        $\lambda_2/\lambda_1$     $k_0$

      1.5       35           1.5       68
      2         14           2         25
      2.5        9           2.5       16
      3          7           3         12


It may be noted that if it is reasonable to assume that $\sigma$ is constant, $\lambda_2/\lambda_1$ is the ratio of
the ‘unacceptable’ to the ‘acceptable’ limit of between-groups variability.
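The minimum number of groups can be checked numerically. The following is a minimal sketch, not the author's method of computation: it assumes the limiting criterion $\chi^2_{\alpha_1}/\chi^2_{1-\alpha_2} < (\lambda_2/\lambda_1)^2$ with $(k-1)$ degrees of freedom, and the function names `chi2_cdf`, `chi2_quantile` and `min_groups` are hypothetical helpers introduced only for this illustration.

```python
import math

def chi2_cdf(x, df):
    """Chi-square distribution function via the series for the
    regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:          # all terms are positive
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))

def chi2_quantile(p, df):
    """Point x with chi2_cdf(x, df) = p, found by bisection."""
    lo, hi = 0.0, df + 20.0 * math.sqrt(2.0 * df) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def min_groups(lambda_ratio, alpha1, alpha2, k_max=200):
    """Smallest k for which the limiting form of (5) can hold (assumed criterion)."""
    for k in range(2, k_max + 1):
        upper = chi2_quantile(1.0 - alpha1, k - 1)   # upper 100*alpha1 % point
        lower = chi2_quantile(alpha2, k - 1)         # lower 100*alpha2 % point
        if upper / lower < lambda_ratio ** 2:
            return k
    return None

print(min_groups(2.0, 0.05, 0.05))
```

Under these assumptions the search agrees with the tabulated entries tried here, for instance $\lambda_2/\lambda_1 = 2$ with $\alpha_1 = \alpha_2 = 0.05$.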

8. The alternative systems developed in §§ 2 and 3 are of use in the interpretation of the
analysis of variance for data arranged according to a cross-classification. We shall consider
only the simple case of a two-way cross-classification into $l$ ‘rows’ and $m$ ‘columns’, there
being $n$ observations in each cell. The symbol $x_{ijt}$ will be used to denote the $t$th observation
in the cell belonging to the $i$th row and the $j$th column. The analysis of variance table is:


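Source              Sum of squares                                                                              Degrees of freedom

Between rows        $mn\sum_{i=1}^{l}(\bar{x}_{i..}-\bar{x}_{...})^2$                                          $l-1$
Between columns     $ln\sum_{j=1}^{m}(\bar{x}_{.j.}-\bar{x}_{...})^2$                                          $m-1$
Interaction         $n\sum_{i=1}^{l}\sum_{j=1}^{m}(\bar{x}_{ij.}-\bar{x}_{i..}-\bar{x}_{.j.}+\bar{x}_{...})^2$   $(l-1)(m-1)$
Within cells        $\sum_{i=1}^{l}\sum_{j=1}^{m}\sum_{t=1}^{n}(x_{ijt}-\bar{x}_{ij.})^2$                        $lm(n-1)$

where

$$\bar{x}_{ij.} = n^{-1}\sum_{t=1}^{n}x_{ijt}, \quad \bar{x}_{i..} = (mn)^{-1}\sum_{j=1}^{m}\sum_{t=1}^{n}x_{ijt}, \quad \bar{x}_{.j.} = (ln)^{-1}\sum_{i=1}^{l}\sum_{t=1}^{n}x_{ijt},$$

$$\bar{x}_{...} = (lmn)^{-1}\sum_{i=1}^{l}\sum_{j=1}^{m}\sum_{t=1}^{n}x_{ijt}.$$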

The question now arises: should the ratio of the between-rows mean square to the within-cells
mean square be used to test for differences between rows, or should the ratio to the
interaction mean square be used? (A similar question arises, of course, in the analysis between
columns.) The answer to this question depends on which of the two set-ups (6) and (7),
shown below, is the more appropriate.

First, extending (1), we have

$$x_{ijt} = A + R_i + C_j + I_{ij} + z_{ijt}. \qquad (6)$$

As before the $z_{ijt}$’s are independent random variables each with expected value zero and standard deviation $\sigma$. The parameter $A$ is introduced to simplify the mathematics and represents the overall average level of the character measured. $R_i$ represents the average departure from this level in the $i$th row; $C_j$ represents the average departure in the $j$th column. Without loss of generality it may be assumed that

$$\sum_{i=1}^{l}R_i = \sum_{j=1}^{m}C_j = 0.$$

$I_{ij}$ represents the interaction, or departure from the linear set-up, in the cell belonging to the $i$th row and the $j$th column. Without loss of generality it may be assumed that

$$\sum_{i=1}^{l}I_{ij} = \sum_{j=1}^{m}I_{ij} = 0.$$

The expected values of the various mean squares in the analysis of variance table are, under the conditions summarized by (6):

      Mean square              Expected value

(i)   Between rows             $\sigma^2 + mn\sum_{i=1}^{l}R_i^2/(l-1)$
(ii)  Between columns          $\sigma^2 + ln\sum_{j=1}^{m}C_j^2/(m-1)$
(iii) Interaction              $\sigma^2 + n\sum_{i=1}^{l}\sum_{j=1}^{m}I_{ij}^2/(l-1)(m-1)$
(iv)  Within cells             $\sigma^2$

It is clear, therefore, that the ratio of (i) to (iv) should be used to test the hypothesis $R_1 = R_2 = \dots = R_l = 0$ when set-up (6) is valid. As in the case of set-up (1), Tang’s tables can be used to calculate the power of the test to establish significance when there is a true effect. If the interaction mean square were used instead of the within-cells mean square we should risk reaching an inconclusive result when significance could have been established (i.e. there will be an increase in the second kind of error).

Now consider the alternative set-up

$$x_{ijt} = A + R_i + C_j + z'_{ij} + z_{ijt}, \qquad (7)$$

formed by replacing the parameters $I_{ij}$ by the random variables $z'_{ij}$. The $z'_{ij}$’s are assumed to be independent of each other and of the $z_{ijt}$’s, each having expected value zero and standard deviation $\sigma'$. Under these conditions the expected values of the mean squares in the analysis of variance table are:

      Mean square              Expected value

(i)   Between rows             $\sigma^2 + n\sigma'^2 + mn\sum_{i=1}^{l}R_i^2/(l-1)$
(ii)  Between columns          $\sigma^2 + n\sigma'^2 + ln\sum_{j=1}^{m}C_j^2/(m-1)$
(iii) Interaction              $\sigma^2 + n\sigma'^2$
(iv)  Within cells             $\sigma^2$

It is apparent that if the ratio (i)/(iv) be used to test the hypothesis $R_1 = R_2 = \dots = R_l = 0$ when set-up (7) is valid, the numerator will tend to be increased by the $\sigma'^2$ effect without any corresponding increase in the denominator. There would thus be a bias towards rejection of the null hypothesis (i.e. there will be an increase in the first kind of error). In this case, therefore, the ratio (i)/(iii) would be a preferable criterion.
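The two candidate criteria can be computed side by side from the analysis of variance table. The sketch below is only an illustration under stated assumptions: the data layout `x[i][j][t]` (equal numbers $n$ in every cell) and the function name `two_way_anova` are hypothetical, and no significance points are computed.

```python
def two_way_anova(x):
    """x[i][j][t]: l rows, m columns, n observations per cell (assumed layout).
    Returns the four sums of squares and the two between-rows ratios."""
    l, m, n = len(x), len(x[0]), len(x[0][0])
    cell = [[sum(x[i][j]) / n for j in range(m)] for i in range(l)]
    row = [sum(cell[i]) / m for i in range(l)]
    col = [sum(cell[i][j] for i in range(l)) / l for j in range(m)]
    grand = sum(row) / l

    ss_rows = m * n * sum((r - grand) ** 2 for r in row)
    ss_cols = l * n * sum((c - grand) ** 2 for c in col)
    ss_int = n * sum((cell[i][j] - row[i] - col[j] + grand) ** 2
                     for i in range(l) for j in range(m))
    ss_within = sum((x[i][j][t] - cell[i][j]) ** 2
                    for i in range(l) for j in range(m) for t in range(n))

    ms_rows = ss_rows / (l - 1)
    ms_int = ss_int / ((l - 1) * (m - 1))
    ms_within = ss_within / (l * m * (n - 1))
    return {
        "ss": (ss_rows, ss_cols, ss_int, ss_within),
        "rows_vs_within": ms_rows / ms_within,      # ratio (i)/(iv), suited to set-up (6)
        "rows_vs_interaction": ms_rows / ms_int,    # ratio (i)/(iii), suited to set-up (7)
    }

data = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 9.0]]]
result = two_way_anova(data)
print(result["rows_vs_within"], result["rows_vs_interaction"])
```

Reporting both ratios makes it easy to see whether the two criteria lead to the same verdict for a given set of data.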



In the case of a two-way classification, therefore, unlike the case of $k$ groups, the two forms of theoretical set-up lead to different procedures in the analysis of variance. As before, the two systems, in this case (6) and (7), may be regarded as extreme cases. An intermediate set-up of the form

$$x_{ijt} = A + R_i + C_j + I_{ij} + z'_{ij} + z_{ijt}$$

would possibly be a truer reflexion of the practical position. This being so, it may be as well to consider both the ratio with the within-cells mean square and that with the interaction mean square in the denominator. Since the biases introduced by using the inexact tests are in opposite directions it follows that if the verdicts given by both ratios agree, confidence may be placed in the joint decision. Otherwise, closer consideration must be given to the conditions of collection of the data, to decide whether (6) or (7) is the more appropriate set-up.

Considerations similar to those developed above may be applied to more complicated 
problems in the analysis of variance. As a general rule, if an interaction is represented by 
a random variable, it is necessary to take it into account when testing interactions of lower 
order involving the same factors. Otherwise it is not necessary to do so. 

9. Randomization theory provides yet a third system of theoretical set-ups. The essential 
difference between randomization theory and the systems already described is clearly 
illustrated by comparison of the theoretical set-ups they imply in the case of simple classi- 
fication by groups. 

It is possible to apply randomization theory to such a situation only if it is reasonable to 
suppose that any individual in the sample could have occurred in any one of the groups. 
A typical case arises when the effects of a number of treatments are to be investigated. The 
experimental material is divided into a number of groups and each group is assigned to a 
particular treatment. 

It is supposed that the experimental arrangements are such that each possible arrangement of the $N$ individuals into $k$ groups of size $n_1, n_2, \dots, n_k$ respectively is equally likely. The process of randomization in selecting the groups is an attempt to make practice consistent with the theory in this respect. The null hypothesis may then be expressed: ‘the observed value of the character measured will be the same for any one individual, whatever be the group in which it is placed.’ On the null hypothesis, therefore, there are only $N$ possible observed values $u_1, u_2, \dots, u_N$. Any observed set of values $x_{11}, \dots, x_{kn_k}$ is simply a rearrangement of the $u_r$’s. Since $x_{ij}$ is equally likely to have any of the $N$ values $u_1, u_2, \dots, u_N$, the theoretical set-up on the null hypothesis may be written

$$x_{ij} = y_{ij}, \qquad (8)$$

where $y_{ij}$ is a discontinuous random variable with distribution function

$$\Pr\{y_{ij} = u_r\} = N^{-1} \qquad (r = 1, 2, \dots, N).$$

It will be noted that all the $y_{ij}$’s have the same distribution, but they are not independent. If $y_{ij} = u_r$, then, in general, no other of the $y_{ij}$’s can take the value $u_r$.

It has been shown (Fisher, 1935; Welch, 1937, 1938; Pitman, 1937) that in the simple cases of classification by groups and cross-classification considered in this paper the distribution of the mean square ratio on the randomization theory may be replaced by the normal theory $F$ distribution with little fear of serious error. In these cases, therefore, randomization of an experiment ensures that critical limits based on normal theory shall be approximately valid.
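The randomization distribution described above can be enumerated directly when $N$ is small. The following is a minimal sketch for the special case of $k = 2$ groups, assuming that no arrangement has zero within-groups variation; the function names `f_ratio` and `randomization_p_value` are hypothetical and are not taken from the papers cited.

```python
from itertools import combinations

def f_ratio(g1, g2):
    """Between-groups to within-groups mean square ratio for two groups."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    grand = (sum(g1) + sum(g2)) / (n1 + n2)
    between = n1 * (m1 - grand) ** 2 + n2 * (m2 - grand) ** 2   # 1 degree of freedom
    within = sum((v - m1) ** 2 for v in g1) + sum((v - m2) ** 2 for v in g2)
    return between / (within / (n1 + n2 - 2))

def randomization_p_value(g1, g2):
    """Proportion of equally likely arrangements whose ratio is at least
    as large as the one observed."""
    pooled = list(g1) + list(g2)
    observed = f_ratio(g1, g2)
    indices = range(len(pooled))
    count = total = 0
    for chosen in combinations(indices, len(g1)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if f_ratio(a, b) >= observed - 1e-12:   # small tolerance for rounding
            count += 1
    return count / total

p = randomization_p_value([1, 2, 3], [4, 5, 6])
print(p)
```

For larger $N$ complete enumeration becomes impractical, which is one reason the normal theory approximation mentioned in the text is convenient.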

If we denote $\bar{u} = N^{-1}\sum_{r=1}^{N}u_r$ by $A$, (8) may be written

$$x_{ij} = A + y'_{ij},$$

where $y'_{ij} = y_{ij} - A$. The $y'_{ij}$’s are discontinuous, dependent random variables, each having the same distribution with expected value zero. Logically, alternative hypotheses, corresponding to the existence of differences between the groups, could be introduced in accordance with either of the systems described earlier in this paper. Symbolically these would be expressed

$$x_{ij} = A + B_i + y'_{ij}, \qquad (9)$$

$$x_{ij} = A + z'_i + y'_{ij}. \qquad (10)$$

(9) corresponds to set-up (1), and (10) to (2), so far as the representation of group differences is concerned.

Set-up (1) may be written in the form

$$x_{ij} = A + B_i + z_{ij}. \qquad (11)$$

($B_i$ has been replaced by $A + B_i$, but this is merely a matter of convenience, and does not affect the nature of the set-up.) We also recall that set-up (2) is

$$x_{ij} = A + z'_i + z_{ij}. \qquad (12)$$

Studying set-ups (9) to (12) we notice that while the set-ups (1) and (2) differ in the nature of between-groups variation which they specify, randomization theory implies a modification in the distribution of the individual random residual variation, replacing the independent $z_{ij}$’s by the dependent $y'_{ij}$’s.

Although four different set-ups are shown above, it seems that (10) would be used but rarely. The use of constant parameters to represent group effects appears to be more consistent with the ideas of randomization theory.

REFERENCES 

Daniels, H. E. (1939). J. Roy. Statist. Soc. Suppl. 6, 186.

Eisenhart, C. (1947). Biometrics, 3, 1.

Fisher, R. A. (1935). The Design of Experiments. London: Oliver and Boyd.

Hsu, P. L. (1941). Biometrika, 32, 62.

Pitman, E. J. G. (1937). Biometrika, 29, 322.

Tang, P. C. (1938). Statist. Res. Mem. 2, 126.

Welch, B. L. (1937). Biometrika, 29, 21.

Welch, B. L. (1938). Biometrika, 30, 149.





AN EXAMINATION AND FURTHER DEVELOPMENT OF A FORMULA 
ARISING IN THE PROBLEM OF COMPARING TWO MEAN VALUES 


By ALICE A. ASPIN, B.A., Leeds University 


1. Introduction 


In a recent paper B. L. Welch (1947) has developed, by a formal process, an expansion applicable to the problem of comparing two mean values. It is the purpose of the present paper (a) to extend this expansion to some further terms, (b) to investigate the numerical behaviour of the expansion in some particular cases and (c) to consider the comparative merits of a rearranged form of the expansion.

If we have two normal populations with true means $\alpha_1$ and $\alpha_2$ respectively and true variances $\sigma_1^2$ and $\sigma_2^2$; and if we have samples of sizes $n_1$ and $n_2$ drawn respectively from these populations, yielding sample means $\bar{x}_1$ and $\bar{x}_2$ and sample variances $s_1^2$ and $s_2^2$; then $(\bar{x}_1-\bar{x}_2)$ is distributed normally about $(\alpha_1-\alpha_2)$ with variance $(\sigma_1^2/n_1+\sigma_2^2/n_2)$, and $s_1^2$ and $s_2^2$ are distributed as $\chi_1^2\sigma_1^2/(n_1-1)$ and $\chi_2^2\sigma_2^2/(n_2-1)$ respectively. Problems of some importance are either to assess the significance of the observed difference $(\bar{x}_1-\bar{x}_2)$, or to calculate as a function of $(\bar{x}_1-\bar{x}_2)$, $s_1^2$ and $s_2^2$, limits within which the population difference $(\alpha_1-\alpha_2)$ may be said to lie with a given probability. In his theoretical discussion of this problem Welch has found it convenient to consider it as a particular case of the following more general problem which, formally, is no more difficult to solve.

Suppose $\eta$ is any population parameter, estimated by an observed quantity $y$ which is normally distributed with variance $\sum_{i=1}^{k}\lambda_i\sigma_i^2$. Suppose that, in addition, the data provide estimates $s_i^2$ of the unknown variances $\sigma_i^2$ $(i = 1, 2, \dots, k)$, based on $f_i$ degrees of freedom, and distributed as

$$\chi_i^2\sigma_i^2/f_i, \qquad (1)$$

where the $s_i^2$ $(i = 1, 2, \dots, k)$ are all statistically independent of each other and of $y$. Let $h(s_1^2, s_2^2, \dots, s_k^2, P)$ be that function of the $s_i^2$ and $P$ such that the probability is $P$ that $(y-\eta)$ falls short of $h(s_1^2, s_2^2, \dots, s_k^2, P)$. Then, if we can find $h(s_1^2, s_2^2, \dots, s_k^2, P)$ in general, the particular application to the case of the comparison of means will follow by setting $k = 2$, $\eta = (\alpha_1-\alpha_2)$, $y = (\bar{x}_1-\bar{x}_2)$, $\lambda_1 = 1/n_1$ and $\lambda_2 = 1/n_2$.

Writing, for convenience, $h(s^2)$ for $h(s_1^2, s_2^2, \dots, s_k^2, P)$, Welch has shown that the integral equation which $h(s^2)$ must satisfy can be expressed symbolically in the form

$$\Theta\, I\left\{\frac{h(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\} = P. \qquad (2)$$

The notation used here is that $I(v)$ stands for the integral, from $-\infty$ to $v$, of the unit normal probability function and

$$\Theta = \exp\left[\sum_i\left\{\frac{\sigma_i^4\partial_i^2}{f_i} + \frac{4\sigma_i^6\partial_i^3}{3f_i^2} + \text{etc.}\right\}\right], \qquad (3)$$

where $\partial_i^r$ implies repeated differentiation with respect to $w_i$ and subsequent equation of all $w_i$ to $\sigma_i^2$.

When the $f_i$ are all large, the solution of (2) is $h(w) = \xi\sqrt{(\Sigma\lambda_i w_i)}$, where $\xi$ is the normal deviate such that $I(\xi) = P$. More generally let us write

$$h(w) = h_0(w) + h_1(w) + h_2(w) + \text{etc.}, \qquad (4)$$

where $h_1(w)$ includes terms of order $1/f_i$, $h_2(w)$ terms of order $1/f_i^2$, and so on. Further, suppose that we have calculated terms up to and including the order $1/f_i^r$ and wish to obtain an extra term $h_{r+1}(w)$. To the required order we may then write

$$I\left\{\frac{h(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\} = I\left\{\frac{h_0(w)+\dots+h_r(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\} + \frac{h_{r+1}(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\, I'\left\{\frac{h_0(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\}. \qquad (5)$$

When we operate on this with $\Theta$ the second term will, to the order involved, need to be treated only with the unit part of (3). Hence (2) will give

$$\Theta\, I\left\{\frac{h_0(w)+\dots+h_r(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\} + \frac{h_{r+1}(\sigma^2)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\, I'(\xi) = P. \qquad (6)$$

This equation gives $h_{r+1}(\sigma^2)$ explicitly. Since $h_{r+1}(w)$ is the same function of $w_i$ and $h_{r+1}(s^2)$ the same function of $s_i^2$ as $h_{r+1}(\sigma^2)$ is of $\sigma_i^2$, we have therefore an explicit method of deriving successive terms in an expansion for $h(s^2)$.

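Pursuing the symbolic method a stage further we may expand $I\left\{\dfrac{h_0(w)+\dots+h_r(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\}$ formally in a Taylor series, thus

$$I\left\{\frac{h_0(w)+\dots+h_r(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\} = \exp\left[\left\{\frac{h_0(w)+\dots+h_r(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}} - \xi\right\}D\right] I(v), \qquad (7)$$

where $D^r$ denotes repeated differentiation with respect to $v$ and subsequent equation of $v$ to $\xi$. The operation $\Theta$ in (6) may then be regarded as acting first on the exponential on the right-hand side of (7), producing an expression involving $D$ which must in turn be operated on $I(v)$. Following out this procedure, Welch (1947, p. 31) has obtained the following expressions:

$$h_0(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}, \qquad h_1(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}\,\frac{(1+\xi^2)}{4}\,\frac{\Sigma(\lambda_i^2 s_i^4/f_i)}{(\Sigma\lambda_i s_i^2)^2},$$

$$h_2(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}\left[-\frac{(1+\xi^2)}{2}\,\frac{\Sigma(\lambda_i^2 s_i^4/f_i^2)}{(\Sigma\lambda_i s_i^2)^2} + \frac{(3+5\xi^2+\xi^4)}{3}\,\frac{\Sigma(\lambda_i^3 s_i^6/f_i^2)}{(\Sigma\lambda_i s_i^2)^3} - \frac{(15+32\xi^2+9\xi^4)}{32}\,\frac{\{\Sigma(\lambda_i^2 s_i^4/f_i)\}^2}{(\Sigma\lambda_i s_i^2)^4}\right]. \qquad (8)$$

If we introduce the notation

$$V_{ru} = \frac{\Sigma(\lambda_i^r s_i^{2r}/f_i^u)}{(\Sigma\lambda_i s_i^2)^r}, \qquad (9)$$

equations (8) may be more compactly written as

$$h_0(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}, \qquad \frac{h_1(s^2)}{h_0(s^2)} = \tfrac{1}{4}(1+\xi^2)V_{21},$$

$$\frac{h_2(s^2)}{h_0(s^2)} = \left[-\tfrac{1}{2}(1+\xi^2)V_{22} + \tfrac{1}{3}(3+5\xi^2+\xi^4)V_{32} - \tfrac{1}{32}(15+32\xi^2+9\xi^4)V_{21}^2\right]. \qquad (10)$$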
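The terms in (10) are easy to evaluate numerically. The sketch below is an illustration only, assuming the definition of $V_{ru}$ in (9) and the value $\xi = 1.64485$ used later in the paper; the function names `V` and `successive_H` are hypothetical. It returns $1 + h_1/h_0$ and $1 + (h_1+h_2)/h_0$.

```python
XI = 1.64485  # normal deviate for P = 95 %, the value quoted in the paper

def V(r, u, lam, s2, f):
    """V_ru of equation (9): sum(lam_i^r s_i^(2r) / f_i^u) / (sum lam_i s_i^2)^r."""
    denom = sum(l * s for l, s in zip(lam, s2)) ** r
    return sum((l * s) ** r / fi ** u for l, s, fi in zip(lam, s2, f)) / denom

def successive_H(lam, s2, f, xi=XI):
    """First two successive approximations to h(s^2) / (xi * sqrt(sum lam_i s_i^2))."""
    x2, x4 = xi ** 2, xi ** 4
    h1 = 0.25 * (1 + x2) * V(2, 1, lam, s2, f)
    h2 = (-0.5 * (1 + x2) * V(2, 2, lam, s2, f)
          + (3 + 5 * x2 + x4) / 3 * V(3, 2, lam, s2, f)
          - (15 + 32 * x2 + 9 * x4) / 32 * V(2, 1, lam, s2, f) ** 2)
    return 1 + h1, 1 + h1 + h2

# k = 2, f1 = f2 = 6; only the ratio lam1*s1^2 / (lam1*s1^2 + lam2*s2^2) matters
H1_end, H2_end = successive_H([1.0, 1.0], [1.0, 0.0], [6, 6])   # ratio 1.0
H1_mid, H2_mid = successive_H([1.0, 1.0], [1.0, 1.0], [6, 6])   # ratio 0.5
print(round(H1_end, 4), round(H2_end, 4), round(H1_mid, 4), round(H2_mid, 4))
```

Passing $\lambda_i$, $s_i^2$ and $f_i$ as parallel lists keeps the sketch valid for any $k$, although only the case $k = 2$ is exercised here.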





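2. The development of further terms

To find $h_3(\sigma^2)$ we must first extend equation (3) to include terms of order $1/f_i^3$. This gives

$$\Theta = \exp\left[\sum_i\left\{\frac{\sigma_i^4\partial_i^2}{f_i} + \frac{4\sigma_i^6\partial_i^3}{3f_i^2} + \frac{2\sigma_i^8\partial_i^4}{f_i^3}\right\}\right]. \qquad (11)$$

This expression must operate on the exponential

$$\exp\left[\left\{\frac{h_0(w)+h_1(w)+h_2(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}} - \xi\right\}D\right] = \exp\left[\left\{\frac{h_0(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}} - \xi\right\}D\right]\exp\left[\left\{\frac{h_1(w)+h_2(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}\right\}D\right]$$

$$= \exp\left[\left\{\frac{h_0(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}} - \xi\right\}D\right]\left\{1 + \frac{h_1(w)+h_2(w)}{\sqrt{(\Sigma\lambda_i\sigma_i^2)}}D + \frac{h_1^2(w)+2h_1(w)h_2(w)}{2(\Sigma\lambda_i\sigma_i^2)}D^2 + \frac{h_1^3(w)}{6(\Sigma\lambda_i\sigma_i^2)^{3/2}}D^3\right\}. \qquad (12)$$

$h_1(w)$ and $h_2(w)$ are already known and can be substituted in here. When this is done the successive differentiations required by (11) can be carried out and the first term on the left-hand side of (6) evaluated for the case $(r+1) = 3$. This equation then gives $h_3(\sigma^2)$. The algebra involved is heavy and the details will not be given here. The eventual result is found to be

$$\frac{h_3(s^2)}{h_0(s^2)} = \left[(1+\xi^2)V_{23} - 2(3+5\xi^2+\xi^4)V_{33} + \tfrac{1}{8}(15+32\xi^2+9\xi^4)V_{22}V_{21}\right.$$

$$+ \tfrac{1}{8}(75+173\xi^2+63\xi^4+5\xi^6)V_{43} - \tfrac{1}{12}(105+298\xi^2+140\xi^4+15\xi^6)V_{32}V_{21}$$

$$\left. + \tfrac{1}{384}(945+3169\xi^2+1811\xi^4+243\xi^6)V_{21}^3\right]. \qquad (13)$$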
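One quick numerical check of the coefficients in (13) is to set $k = 1$, when every $V_{ru}$ reduces to $1/f^u$ and the bracket should agree with the corresponding term of the well-known expansion of the ‘Student’ deviate, $(3\xi^6+19\xi^4+17\xi^2-15)/(384f^3)$. The sketch below is an illustration under that assumption; the function name `h3_over_h0` and the callable argument `V` are hypothetical.

```python
def h3_over_h0(V, xi):
    """Right-hand side of (13); V is any callable returning V_ru for given (r, u)."""
    x2, x4, x6 = xi ** 2, xi ** 4, xi ** 6
    return ((1 + x2) * V(2, 3)
            - 2 * (3 + 5 * x2 + x4) * V(3, 3)
            + (15 + 32 * x2 + 9 * x4) / 8 * V(2, 2) * V(2, 1)
            + (75 + 173 * x2 + 63 * x4 + 5 * x6) / 8 * V(4, 3)
            - (105 + 298 * x2 + 140 * x4 + 15 * x6) / 12 * V(3, 2) * V(2, 1)
            + (945 + 3169 * x2 + 1811 * x4 + 243 * x6) / 384 * V(2, 1) ** 3)

def student_third_term(xi, f):
    """Term of order 1/f^3 in the expansion of t/xi for a 'Student' deviate."""
    x2, x4, x6 = xi ** 2, xi ** 4, xi ** 6
    return (3 * x6 + 19 * x4 + 17 * x2 - 15) / (384 * f ** 3)

xi, f = 1.64485, 6
single_group = lambda r, u: 1.0 / f ** u     # k = 1: V_ru = 1 / f^u
print(h3_over_h0(single_group, xi), student_third_term(xi, f))
```

Agreement in this special case does not prove (13) for general $k$, but it does constrain the six numerical coefficients jointly.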

The labour involved in calculating groups of terms of successive orders increases rapidly
with $r$. This is due to the increasing number of differential operations introduced with each
new term in the expansion of $\Theta$ and also to the rapidly increasing complexity of the expression
on which $\Theta$ operates. Hence, even for $(r+1) = 3$, a very large number of differentiations
with respect to the $w_i$ have to be carried out before the final equation of all $w_i$ to $\sigma_i^2$. In
extending the work still further to the terms of order $1/f_i^4$, the present author found it necessary
to cover over 100 pages with algebra, details of which need not now be given. The eventual
result is found to be


^==[-2(l + ?)F 24 + V(3 + 5?+?)F 34 

- 1(15 + 32? + mFnVu + TO 

- 1(75 + 1 73? + 63? + 5?) F 44 

+ 1( 105 + 298? + 1 40? + 15?) {£F 22 F 32 + F ai F 33 } 

+ i(15 + 33?+ll? + ?)t$ 4 
+ £(735 + 2 1 70? + 1 1 26? + 1 68? + 7? ) F M 

- ^(945 + 31 69? +1811? + 243?) F 42 F| x 

- T V(945 + 3354? + 2166? + 425? + 25?) F§ 2 
-^(4726 + 16586?+ 10514?+ 1974?+ 105?)F 21 F 43 
+ ^(10396 + 42429? + 31938? + 7335? + 495?)F 32 Fix 

— tAj(! 351 3§ + 626144? + 542026?+ 145320?+ 1 1583?) Fjj]. (14) 





3. Checks 

As so much heavy algebra has been involved in reaching these results, greater confidence will be placed in them if some independent checks are available. One possibility is to try to find an expansion of some function of $h(s^2)$ rather than of $h(s^2)$ itself. It happens, indeed, that the square $h^2(s^2)$ can be developed in terms of successive orders in $1/f_i$ by a similar method to that used above without involving quite as much labour. In this development the $\Theta$ of equation (2), instead of operating on a normal probability integral, operates on the integral of the distribution of the square of a unit normal deviate. The first approximation to $h^2(s^2)$ is $\xi^2(\Sigma\lambda_i s_i^2)$ so that, analogously to equation (12), we find that $\Theta$ has to operate on the product of two factors of which the first is $\exp\left[\left\{\dfrac{(\Sigma\lambda_i w_i)}{(\Sigma\lambda_i\sigma_i^2)} - 1\right\}\xi^2 D\right]$. Since this does not contain a square-root sign, some of the labour of operating with $\Theta$ is lightened. A full check of equation (13) using this alternative method has been carried out.

A full independent check of the terms of order $1/f_i^4$ has not been similarly obtained, but a check which is almost as satisfactory is obtained by putting $k = 1$. We then find that (14) reduces to

$$\frac{h_4(s^2)}{h_0(s^2)} = \frac{(-945 - 1920\xi^2 + 1482\xi^4 + 776\xi^6 + 79\xi^8)}{360 \times 4^4 f^4}. \qquad (15)$$

This agrees, as it should, with the term in $1/f^4$ in the expansion of the straightforward ‘Student’ deviate given by R. A. Fisher (1941, p. 151). It is difficult to imagine algebraical mistakes which could have been made in reaching (14) without invalidating this agreement in the particular case, $k = 1$.
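For $k = 1$ the whole series can be summed and compared with tabulated ‘Student’ percentage points. The following is a minimal sketch, assuming the standard expansion of the ‘Student’ deviate in powers of $1/f$ with (15) as its fourth-order term; the function name `student_deviate_series` is hypothetical, and the normal deviate is taken from the Python standard library rather than from the paper.

```python
from statistics import NormalDist

def student_deviate_series(P, f):
    """Approximate 'Student' percentage point from the expansion in powers
    of 1/f, carried as far as the term of order 1/f^4 given in (15)."""
    xi = NormalDist().inv_cdf(P)
    x2, x4, x6, x8 = xi ** 2, xi ** 4, xi ** 6, xi ** 8
    ratio = (1
             + (1 + x2) / (4 * f)
             + (3 + 16 * x2 + 5 * x4) / (96 * f ** 2)
             + (-15 + 17 * x2 + 19 * x4 + 3 * x6) / (384 * f ** 3)
             + (-945 - 1920 * x2 + 1482 * x4 + 776 * x6 + 79 * x8)
             / (360 * 4 ** 4 * f ** 4))
    return xi * ratio

for f in (6, 12, 18):
    print(f, round(student_deviate_series(0.95, f), 4))
```

The values obtained for $f = 6$, 12 and 18 at $P = 95$ % can be set beside the first line of Table 2.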

4. A REARRANGEMENT OF THE EXPANSION 

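It will be seen on inspection of equations (10), (13) and (14) that $h_1(s^2)$, $h_2(s^2)$, $h_3(s^2)$ and $h_4(s^2)$ all contain a term having $(1+\xi^2)$ as a factor. Grouping these terms together, the total contribution to $h(s^2)$ from this source is

$$\tfrac{1}{4}\{\xi\sqrt{(\Sigma\lambda_i s_i^2)}\}(1+\xi^2)\{V_{21} - 2V_{22} + 4V_{23} - 8V_{24} \dots\}. \qquad (16)$$

Further, we have

$$(\Sigma\lambda_i s_i^2)^2\{V_{21} - 2V_{22} + 4V_{23} - 8V_{24} \dots\} = \Sigma\frac{\lambda_i^2 s_i^4}{f_i} - 2\,\Sigma\frac{\lambda_i^2 s_i^4}{f_i^2} + 4\,\Sigma\frac{\lambda_i^2 s_i^4}{f_i^3} - 8\,\Sigma\frac{\lambda_i^2 s_i^4}{f_i^4} \dots = \Sigma\frac{\lambda_i^2 s_i^4}{f_i+2}. \qquad (17)$$

Hence (16) gives a contribution to $h(s^2)$ which we may denote by $h'_1(s^2)$, where

$$h'_1(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}\,\frac{(1+\xi^2)}{4}\,\frac{\Sigma\{\lambda_i^2 s_i^4/(f_i+2)\}}{(\Sigma\lambda_i s_i^2)^2}. \qquad (18)$$

Other terms may be combined in a similar manner. Indeed, if we write

$$W_u = \frac{\Sigma\left\{\dfrac{\lambda_i^{u+1}s_i^{2(u+1)}}{(f_i+2)(f_i+4)\dots(f_i+2u)}\right\}}{(\Sigma\lambda_i s_i^2)^{u+1}}, \qquad (19)$$

we can rearrange the expansion of $h(s^2)$ in the form

$$h(s^2) = h_0(s^2) + h'_1(s^2) + h'_2(s^2) + \text{etc.}, \qquad (20)$$

where now

$$h_0(s^2) = \xi\sqrt{(\Sigma\lambda_i s_i^2)}, \qquad \frac{h'_1(s^2)}{h_0(s^2)} = \tfrac{1}{4}(1+\xi^2)W_1,$$

$$\frac{h'_2(s^2)}{h_0(s^2)} = \left[\tfrac{1}{3}(3+5\xi^2+\xi^4)W_2 - \tfrac{1}{32}(15+32\xi^2+9\xi^4)W_1^2\right],$$

$$\frac{h'_3(s^2)}{h_0(s^2)} = \left[\tfrac{1}{8}(75+173\xi^2+63\xi^4+5\xi^6)W_3 - \tfrac{1}{12}(105+298\xi^2+140\xi^4+15\xi^6)W_1W_2 + \tfrac{1}{384}(945+3169\xi^2+1811\xi^4+243\xi^6)W_1^3\right]. \qquad (21)$$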
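The rearranged terms in (21) can be evaluated in the same way as the original ones. The sketch below is an illustration only, assuming the definition of $W_u$ in (19) and $\xi = 1.64485$; the function names `W` and `rearranged_H` are hypothetical. It returns the three successive sums $1 + h'_1/h_0$, $1 + (h'_1+h'_2)/h_0$ and $1 + (h'_1+h'_2+h'_3)/h_0$.

```python
def W(u, lam, s2, f):
    """W_u of equation (19)."""
    denom = sum(l * s for l, s in zip(lam, s2)) ** (u + 1)
    total = 0.0
    for l, s, fi in zip(lam, s2, f):
        prod = 1.0
        for j in range(1, u + 1):
            prod *= fi + 2 * j          # (f_i + 2)(f_i + 4)...(f_i + 2u)
        total += (l * s) ** (u + 1) / prod
    return total / denom

def rearranged_H(lam, s2, f, xi=1.64485):
    """Successive approximations built from the rearranged terms of (21)."""
    x2, x4, x6 = xi ** 2, xi ** 4, xi ** 6
    w1, w2, w3 = (W(u, lam, s2, f) for u in (1, 2, 3))
    t1 = 0.25 * (1 + x2) * w1
    t2 = (3 + 5 * x2 + x4) / 3 * w2 - (15 + 32 * x2 + 9 * x4) / 32 * w1 ** 2
    t3 = ((75 + 173 * x2 + 63 * x4 + 5 * x6) / 8 * w3
          - (105 + 298 * x2 + 140 * x4 + 15 * x6) / 12 * w1 * w2
          + (945 + 3169 * x2 + 1811 * x4 + 243 * x6) / 384 * w1 ** 3)
    return 1 + t1, 1 + t1 + t2, 1 + t1 + t2 + t3

# k = 2, f1 = f2 = 6, ratio lam1*s1^2 / (lam1*s1^2 + lam2*s2^2) = 1.0
Hp1, Hp2, Hp3 = rearranged_H([1.0, 1.0], [1.0, 0.0], [6, 6])
print(round(Hp1, 4), round(Hp2, 4), round(Hp3, 4))
```

Only three distinct quantities $W_1$, $W_2$, $W_3$ are needed up to this order, which is the economy in the number of terms that the rearrangement is meant to provide.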

The $h'_4(s^2)$ contribution does not come out completely in terms of the $W_u$’s, but involves other
expressions in addition. While this contribution cannot on this account be said to be anomalous,
the best way of expressing it is not obvious and it is not proposed to enter into a
discussion of it here.

Going only as far as terms of order $1/f_i^3$ it will be seen that, by using the $W_u$’s, we have
reduced the total number of terms in $h(s^2)$ from eleven to seven. Moreover, the saving in the
number of these terms will be more marked as we proceed to higher orders of $1/f_i$. In certain
circumstances, therefore, particularly if we are computing several probability levels simultaneously,
it seems that there might be some gain in using equations (21). However, the
question of convergence must be considered before a statement of this kind can properly
be made.

5. Numerical investigation 


This will be confined to the case $k = 2$ which includes, in particular, the problem of comparing two mean values. It will further be assumed that $f_1 = f_2 = f$, as would happen, for instance, in the mean value problem if the samples drawn from the two populations were equal. (The common $f$ would then be one less than the common sample size.) We are not, however, assuming anything about the relative sizes of the unknown population variances $\sigma_1^2$ and $\sigma_2^2$.

We shall confine ourselves here to a single probability level $P = 95$ %, so that $\xi = 1.64485$. Let us then write

$$H_r = 1 + \sum_{j=1}^{r}\frac{h_j(s^2)}{\xi\sqrt{(\Sigma\lambda_i s_i^2)}}, \qquad H'_r = 1 + \sum_{j=1}^{r}\frac{h'_j(s^2)}{\xi\sqrt{(\Sigma\lambda_i s_i^2)}}, \qquad (22)$$

so that $H_r$ and $H'_r$ are the successive approximations to $h(s^2)/\xi\sqrt{(\Sigma\lambda_i s_i^2)}$ according as the original expansion or the rearranged form is used. The $H$’s depend on the observed data only through the ratio $s_1^2/s_2^2$ or, equivalently, the ratio $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$. Numerical values in Table 1 are given against this latter ratio for $f = 6$, 12 and 18. The argument of $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$ is by tenths from 1.0 to 0.5 (the values 0.4 to 0.0 follow by symmetry and are not therefore shown).

Considering first the expansion in powers of $1/f_i$, the degree of accuracy obtainable is seen from a comparison of $H_3$ and $H_4$. When $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$ equals 1 (or 0) we appear to have four decimal places accurate when $f = 6$, and five decimal places when $f = 12$ or more. In this extreme case $h(s^2)/\xi\sqrt{(\Sigma\lambda_i s_i^2)}$ is a simple ‘Student’ deviate divided by $\xi$. We already know, of course, that the series representation of such a quantity converges rapidly (e.g. Fisher, 1941). In the final column of our Table 1 we give for comparison the values of $t/\xi$ calculated from Mrs M. Merrington’s table of the percentage points of ‘Student’s’ $t$ (Merrington, 1942).

In the remainder of Table 1 we have no similar independent source against which we can make any checks and can only be guided by the relative sizes of the $H_r$. It will be seen that, as we move towards the middle of the range of $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$, the difference between $H_3$ and $H_4$ tends to increase. As a result it appears that the accuracy available at $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$ equal to ½ is two decimal places for $f = 6$, three decimals for $f = 12$ and four decimals for $f = 18$. The accuracy improves towards each end of the range of $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$.


Table 1. Successive approximations to $h(s^2)/\xi\sqrt{(\Sigma\lambda_i s_i^2)}$
($k = 2$, $f_1 = f_2 = f$, $P = 95$ %, $\xi = 1.64485$)

$f = 6$

Ratio    $H_1$     $H_2$     $H_3$     $H_4$     $H'_1$    $H'_2$    $H'_3$    $t/\xi$
1.0      1.1544    1.1784    1.1812    1.1814    1.1158    1.1334    1.1266    1.1814
0.9      1.1266    1.1478    1.1524    1.1531    1.0949    1.1125    1.1101    -
0.8      1.1050    1.1176    1.1217    1.1248    1.0787    1.0926    1.0930    -
0.7      1.0896    1.0924    1.0917    1.0934    1.0672    1.0765    1.0771    -
0.6      1.0803    1.0761    1.0698    1.0668    1.0602    1.0659    1.0656    -
0.5      1.0772    1.0703    1.0616    1.0559    1.0579    1.0623    1.0615    -

$f = 12$

Ratio    $H_1$     $H_2$     $H_3$     $H_4$     $H'_1$    $H'_2$    $H'_3$    $t/\xi$
1.0      1.07720   1.08319   1.08354   1.08355   1.06617   1.07496   1.07600   1.0836
0.9      1.06330   1.06861   1.06919   1.06923   1.05426   1.06221   1.06356   -
0.8      1.05249   1.05564   1.05617   1.05636   1.04500   1.05111   1.05242   -
0.7      1.04478   1.04552   1.04544   1.04555   1.03838   1.04253   1.04339   -
0.6      1.04014   1.03908   1.03829   1.03810   1.03441   1.03713   1.03747   -
0.5      1.03859   1.03687   1.03578   1.03542   1.03308   1.03528   1.03541   -

$f = 18$

Ratio    $H_1$     $H_2$     $H_3$     $H_4$     $H'_1$    $H'_2$    $H'_3$    $t/\xi$
1.0      1.05147   1.05413   1.05423   1.05423   1.04632   1.05130   1.05219   1.0542
0.9      1.04220   1.04456   1.04473   1.04474   1.03799   1.04238   1.04324   -
0.8      1.03500   1.03640   1.03655   1.03659   1.03150   1.03485   1.03557   -
0.7      1.02985   1.03018   1.03015   1.03017   1.02687   1.02915   1.02961   -
0.6      1.02676   1.02629   1.02606   1.02602   1.02409   1.02561   1.02584   -
0.5      1.02573   1.02497   1.02465   1.02458   1.02316   1.02440   1.02451   -

(Ratio denotes $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$.)

Turning now to the rearranged form of the expansion we notice immediately that at the
beginning of the table $H'_3$ does not compare favourably with $H_3$, since the value of $H'_3$ is not
close to $t/\xi$.

As we leave this end of the table the numerical differences between $H'_3$ and $H_3$ decrease.
It appears moreover, from an examination of the relative sizes of $H'_1$, $H'_2$ and $H'_3$, that in the
vicinity of the value of $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$ equal to ½ the rearranged expansion converges
more quickly than the original one. There does not, however, seem to be a sufficiently strong
case for systematically abandoning the original expansion everywhere in favour of the
rearranged form, even although the latter involves fewer terms.

The present numerical investigation is a preliminary one and other possible rearrangements of the expansion are being considered before any general suggestions for computing tables of $h(s^2)$ are finally made. It will, perhaps, best summarize the work to the point reached if we present again compactly in Table 2 the final result, $H_4$, for all the cases considered. A slight modification may be made at this stage, by multiplying through by $\xi$ so that the quantity tabled is now $h(s^2)/\sqrt{(\Sigma\lambda_i s_i^2)}$. The quantities in the first line of the table are then the ‘Student’ deviates for $f = 6$, 12 and 18. Only as many decimal places are given as appear to be justified by the behaviour of the successive terms in the expansion.

Table 2. Values of $h(s^2)/\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}$ and equivalent $F$
($k = 2$; $P = 95$ %)

           $h(s^2)/\sqrt{(\lambda_1 s_1^2+\lambda_2 s_2^2)}$ for      Equivalent $F$ for
Ratio      $f = 6$    $f = 12$   $f = 18$       $f = 6$    $f = 12$   $f = 18$
1.0        1.9432     1.7823     1.7341         6          12         18
0.9        1.897      1.7587     1.7184         6.9        14.3       21.7
0.8        1.85       1.738      1.7050         8.3        17.4       26.3
0.7        1.80       1.720      1.6945         11         21         32
0.6        1.76       1.708      1.6877         15         25         37
0.5        1.74       1.703      1.6853         17         27         39

(Ratio denotes $\lambda_1 s_1^2/(\lambda_1 s_1^2+\lambda_2 s_2^2)$.)

Although $h(s^2)/\sqrt{(\Sigma\lambda_i s_i^2)}$ cannot, beyond the first line of the table, be derived by noting that some quantity follows a simple ‘Student’ distribution, it is nevertheless of interest to consider what degrees of freedom $F$, say, a ‘Student’ deviate would have to possess if it were to have the same percentage points as the values of $h(s^2)/\sqrt{(\Sigma\lambda_i s_i^2)}$ given. Accordingly we have shown these equivalent $F$’s in the last three columns. These values of $F$ enable one, perhaps, to appreciate better the trend of the figures given in the earlier columns.


6. Application to the comparison of two means 

The quantity $h(s^2)$ was defined in the first section to satisfy the relation

$$\Pr[(y-\eta) < h(s^2)] = P. \qquad (23)$$

$h(s^2)$ is, of course, a function of $P$ as well as of $s_1^2, s_2^2, \dots$, and should, perhaps, have been denoted by $h(s^2, P)$, but $h(s^2)$ was used throughout to save a little trouble in writing. The dependence on $P$ is understood. Turning now to the particular case of comparing mean values we must, as has already been noted, set

$$y = (\bar{x}_1-\bar{x}_2), \quad \eta = (\alpha_1-\alpha_2), \quad \lambda_1 = \frac{1}{n_1}, \quad \lambda_2 = \frac{1}{n_2}, \quad f_1 = (n_1-1), \quad f_2 = (n_2-1), \qquad (24)$$

where $n_1$ and $n_2$ are the respective sample sizes. Then (23) becomes

$$\Pr[\{(\bar{x}_1-\bar{x}_2) - (\alpha_1-\alpha_2)\} < h(s^2)] = P. \qquad (25)$$

If we write

$$v = \frac{\{(\bar{x}_1-\bar{x}_2) - (\alpha_1-\alpha_2)\}}{\sqrt{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)}}, \qquad (26)$$

(25) becomes

$$\Pr\left[v < \frac{h(s^2)}{\sqrt{(s_1^2/n_1+s_2^2/n_2)}}\right] = P. \qquad (27)$$

The numerator of $v$ is normally distributed about zero with variance $(\sigma_1^2/n_1+\sigma_2^2/n_2)$. The quantity under the square-root in the denominator of $v$ is an unbiased estimate of $(\sigma_1^2/n_1+\sigma_2^2/n_2)$ since $s_1^2$ and $s_2^2$ are respectively unbiased estimates of $\sigma_1^2$ and $\sigma_2^2$. The ratio $v$ is, in general, to be distinguished from the quantity

$$u = \frac{\{(\bar{x}_1-\bar{x}_2) - (\alpha_1-\alpha_2)\}}{\sqrt{\left\{\dfrac{(n_1-1)s_1^2+(n_2-1)s_2^2}{(n_1+n_2-2)}\left(\dfrac{n_1+n_2}{n_1 n_2}\right)\right\}}}, \qquad (28)$$

which would be the appropriate one to use if it could be assumed that $\sigma_1 = \sigma_2 = \sigma$. The ratio $u$ would then be referred to the $t$-distribution with $(n_1+n_2-2)$ degrees of freedom. If $u$ is referred to the $t$-distribution when, in fact, $\sigma_1 \neq \sigma_2$ we are liable to be led into error, as has been shown by Welch (1938). These errors are not serious if the sample sizes are equal ($n_1 = n_2 = n$) for then $u$ and $v$ are the same quantity. Such error as there is is then due to referring $u$ (or $v$) to the ‘Student’ distribution with $2f = 2(n-1)$ degrees of freedom when, in fact, as we have seen, the appropriate percentage point depends to some extent on the observed ratio $s_1/s_2$. The critical value of $v$ will be read off from a table like Table 2, entering with $f = (n-1)$ and with the ratio of $s_1^2/n$ to $(s_1^2/n+s_2^2/n)$.
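The distinction between $v$ of (26) and $u$ of (28) can be made concrete with a short computation. This is a minimal sketch with illustrative numbers that are not taken from the paper; the function names `v_ratio` and `u_ratio` and the argument `delta` (standing for the hypothetical value of $\alpha_1-\alpha_2$) are assumptions made for the example.

```python
import math

def v_ratio(xbar1, xbar2, s1_sq, s2_sq, n1, n2, delta=0.0):
    """Ratio v of (26): variances estimated separately for the two samples."""
    return (xbar1 - xbar2 - delta) / math.sqrt(s1_sq / n1 + s2_sq / n2)

def u_ratio(xbar1, xbar2, s1_sq, s2_sq, n1, n2, delta=0.0):
    """Ratio u of (28): pooled variance, appropriate if sigma_1 = sigma_2."""
    pooled = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    return (xbar1 - xbar2 - delta) / math.sqrt(pooled * (n1 + n2) / (n1 * n2))

# Equal sample sizes: the two ratios coincide.
equal_v = v_ratio(10.0, 8.0, 4.0, 9.0, 7, 7)
equal_u = u_ratio(10.0, 8.0, 4.0, 9.0, 7, 7)

# Unequal sample sizes and unequal sample variances: they differ.
unequal_v = v_ratio(10.0, 8.0, 4.0, 9.0, 5, 20)
unequal_u = u_ratio(10.0, 8.0, 4.0, 9.0, 5, 20)
print(equal_v, equal_u, unequal_v, unequal_u)
```

With equal sample sizes the only remaining question is which percentage point to use, which is what a table like Table 2 is intended to answer.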

To take a particular example, suppose that $n_1 = n_2 = 7$ and quantities $\bar{x}_1$, $\bar{x}_2$, $s_1$ and $s_2$ are observed such that

$$\bar{x}_1 - \bar{x}_2 = 3.7, \quad \sqrt{s_1^2/7 + s_2^2/7} = 1.2, \quad \frac{s_1^2/7}{s_1^2/7 + s_2^2/7} = 0.64. \tag{29}$$

Then entering Table 2 with $f = 6$ and the ratio 0.64, we have

$$\frac{h(s^2)}{\sqrt{\lambda_1 s_1^2 + \lambda_2 s_2^2}} = 1.77.$$

If now we wished to test whether the data were consistent with the hypothesis that $\alpha_1 = \alpha_2$, we could compare the numerical value of the ratio $3.7/1.2 = 3.1$ with the significance level 1.77; this it exceeds greatly. On the other hand, if we wished to compute limits within which $(\alpha_1 - \alpha_2)$ lies with given probability, we would have from (27)

$$\Pr[\{3.7 - (\alpha_1 - \alpha_2)\}/1.2 < 1.77] = 0.95. \tag{30}$$

Whence, on rearrangement of the inequality,

$$\Pr[(\alpha_1 - \alpha_2) > 3.7 - 2.12 = 1.58] = 0.95. \tag{31}$$



96 


The problem of comparing two mean values 

Owing to symmetry, the corresponding value of $h(s^2)/\sqrt{\lambda_1 s_1^2 + \lambda_2 s_2^2}$ for $P = 5\%$ will be minus 1.77. Thus analogously to (30)

$$\Pr[\{3.7 - (\alpha_1 - \alpha_2)\}/1.2 < -1.77] = 0.05, \tag{32}$$

whence

$$\Pr[(\alpha_1 - \alpha_2) > 3.7 + 2.12 = 5.82] = 0.05. \tag{33}$$

Combining this with (31)

$$\Pr[1.58 < (\alpha_1 - \alpha_2) < 5.82] = 0.95 - 0.05 = 0.90. \tag{34}$$
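The arithmetic leading from (30) to (34) can be retraced with a few lines of Python. This is only an illustrative sketch: the function name `limits_for_difference` and its argument names are assumptions introduced here, and the critical value 1.77 is simply taken from the text as the tabular entry for $f = 6$ and ratio 0.64.

```python
def limits_for_difference(diff, scale, crit):
    """Return (lower, upper) limits diff -/+ crit * scale.

    diff  : observed difference of the two sample means
    scale : sqrt(s1^2/n1 + s2^2/n2)
    crit  : tabulated upper percentage point (hypothetical input, read from a table)
    """
    half_width = crit * scale
    return diff - half_width, diff + half_width


# Figures quoted in the worked example: 3.7, 1.2 and 1.77.
lower, upper = limits_for_difference(diff=3.7, scale=1.2, crit=1.77)
print(round(lower, 2), round(upper, 2))  # 1.58 5.82
print(round(3.7 / 1.2, 1))               # 3.1, the ratio compared with 1.77
```

The two one-sided statements each carry probability 0.05 in the excluded tail, so the pair of limits corresponds to the 0.90 statement in (34).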


In this equation the limits calculated for (a x — a 2 ) are not limits given to us before the samples 
are drawn, but depend on the observed sample statistics. The probability statement has the 
meaning usually attached to such statements when the method of inverse probability is not 
being used. At the earlier stage in equation (30) the same holds true. The numbers 3.7, 1.2
and 1.77 entering into this equation are all functions of the sample statistics and the probability
statement is strictly about these functions and not about connexions between three
pure numbers regarded as fixed and known to us before the samples were drawn. In principle
the present procedure is exactly the same as that already familiar in those problems where
the 'Student' $t$-distribution is applicable. The only difference in detail is that, in a 'Student'
problem, the number 1.77 in (30) would be replaced by a tabular 'Student' deviate not
dependent on the sample statistics observed. In the problem being considered in the present
paper, we obtained the figure 1.77 by entering a table one of whose arguments was
$\lambda_1 s_1^2/(\lambda_1 s_1^2 + \lambda_2 s_2^2)$, so that the value obtained depended, although not very critically, on the
observed ratio $s_1/s_2$. But the principle is really unaltered and the inversion of the inequality
(30) proceeds in precisely the same manner as when we are considering a problem soluble 
directly by the straightforward ‘Student’ distribution. 

In the present example and in the numerical work of the previous section we have confined 
ourselves to the case $k = 2$ and then to $f_1 = f_2 = f$. In the problem of comparing two means
it is when $f_1 \neq f_2$ that the need for calculations of the kind considered here is most apparent
for then, as Welch (1938) has pointed out, the $u$ and $v$ of equations (26) and (28) are not the
same criterion and the error involved in the usual reference of $u$ to the $t$-distribution can
be more pronounced. We have considered the $f_1 = f_2$ case here first, merely to simplify
calculations in an initial numerical investigation. It is proposed in some future work to
discuss numerically the case where $f_1 \neq f_2$ and to compare the merits of different ways in
which final tables may then be presented. Some consideration will also be given to the case 
where more than two population variances and their corresponding estimates are involved, 
i.e. the general case k > 2 . 


REFERENCES

Fisher, R. A. (1941). The asymptotic approach to Behrens' integral, with further tables for the d test
of significance. Ann. Eugen., Lond., 11, 141-72.

Merrington, Maxine (1942). Table of percentage points of the t-distribution. Biometrika, 32, 311.

Welch, B. L. (1938). The significance of the difference between two means when the population 
variances are unequal. Biometrika, 29, 350-62. 

Welch, B. L. (1947). The generalization of ‘Student’s’ problem when several different population 
variances are involved. Biometrika, 34, 28-35. 



[ 97 ] 


ON THE POWER FUNCTION OF THE LONGEST RUN AS A TEST 
FOR RANDOMNESS IN A SEQUENCE OF ALTERNATIVES 

By G. BATEMAN, University College, London

1. Introduction

It has been suggested that the distribution of the longest run in a sequence of alternatives 
might be used with advantage in quality control work and allied subjects. Mosteller (1941)
considered the case of runs above and below the median for a sample of even size and derived
a formula for the probability of getting at least one run of a given length or greater when the
number of elements of each of the two kinds is the same. Thus, if the alternatives in the
sequence are $E$ and $\bar{E}$, he considered explicitly only the case of $2n$ elements, $n$ of which are
$E$ and $n$ of which are $\bar{E}$. It is the purpose of this paper to deal with the slightly more general
case of unequal numbers of elements of the two kinds and to consider (i) the distribution of 
the longest run under the hypothesis of randomness, and (ii) the power function when the 
alternative hypothesis is that of positive dependence in the sequence both for the simple 
Markoff chain and when the structure of dependence is more complex. In (ii) the conditional 
power function technique, as given by David (1947) for the distribution of groups, is used, 
and the two criteria, length of longest run and number of groups, are compared with respect 
to the same alternative hypotheses. 

2. Distribution of the longest run 

It will be assumed that there is a sequence of $r$ elements, $r_1$ of which are $E$ and $r_2$ of which are $\bar{E}$, where $r_1 + r_2 = r$. $H_0$, the hypothesis to be tested, will be that the elements of the sequence are in a random order, and the criterion used to carry out the test will be the length of the longest run of either $E$'s or $\bar{E}$'s. The total number of sequences which can be formed from the $r$ elements is ${}^{r}C_{r_1}$; this is the fundamental probability set. In order to pick out from this set the sub-set of sequences containing at least one greatest run of a given length, say $g$, it will be necessary to consider the partitions of $r_1$ and of $r_2$ having $k$ as the greatest part, where $k = 1, 2, \ldots, g$, and to find the number of ways in which they can be combined to form a sequence with at least one part equal to $g$ and no part greater than $g$. This may be achieved most simply, perhaps, by considering the different ways in which such partitions of $r_1$ and $r_2$ form $2t$ or $2t + 1$ groups, where $t = 1, 2, \ldots, r_1 - 1$ for $r_1 \ge r_2$. There will be no loss of generality in assuming $r_1 \ge r_2$, and it will be understood in what follows that $r_1 \ge r_2$.

Let $f_i(t, k)$, where $i = 1, 2$, denote the number of compositions* of $r_i$ elements into $t$ parts of which the greatest part contains $k$ elements. This can be expressed in terms of binomial coefficients by using the result that the number of compositions of $r_i$ into $t$ parts, none of which exceeds $s$ in magnitude, is the coefficient of $x^{r_i}$ in the expansion of $(x + x^2 + \ldots + x^s)^t$, i.e. in $x^t\left(\dfrac{1 - x^s}{1 - x}\right)^t$. This coefficient is $\sum_{j \ge 0} (-1)^j\, {}^{t}C_{j}\; {}^{r_i - js - 1}C_{t-1}$, and expressing it in terms of the above notation we have

$$\sum_{k=1}^{s} f_i(t, k) = \sum_{j \ge 0} (-1)^j\, {}^{t}C_{j}\; {}^{r_i - js - 1}C_{t-1}. \tag{1}$$

* Compositions of a number are merely partitions of a number in which the order is taken into 
account. 






Power function of the longest-run test 


[Table 1. Values of the function (4), $N(g \mid r_1 r_2)$, for $r = 10$ to 15 and $r = 20$. The tabulated entries on this page are illegible in the source scan.]




Table 1 (continued)

(The probabilities are obtained by dividing the number tabled by the corresponding value of ${}^{r}C_{r_1}$)


[Continuation of Table 1: the entries on this page are illegible in the source scan, apart from the two rows for $r = 20$ given below.]


 r_1   r_2   rC_r1     g = 11     10      9      8      7      6      5      4      3      2
  11     9   167960        10     90    456   1736   6464  14784  34030  58406  47640   5344
  10    10   184756         .     20    208   1140   4464  13180  34670  64058  58290   8194





It follows immediately that

$$f_i(t, g) = \sum_{k \le g} f_i(t, k) - \sum_{k \le g-1} f_i(t, k).$$

If $N(2t, g \mid r_1 r_2)$ denotes the number of sequences of $2t$ groups when at least one group contains $g$ elements and no group contains more than $g$ elements, then

$$N(2t, g \mid r_1 r_2) = 2\Big[f_1(t, g) \sum_{k \le g} f_2(t, k) + f_2(t, g) \sum_{k \le g-1} f_1(t, k)\Big]. \tag{2}$$

The factor 2 is introduced to allow for the fact that the sequence may begin with $E$ or with $\bar{E}$. In the same way it may be seen that

$$N(2t+1, g \mid r_1 r_2) = \Big[f_1(t+1, g) \sum_{k \le g} f_2(t, k) + f_2(t, g) \sum_{k \le g-1} f_1(t+1, k)\Big] + \Big[f_1(t, g) \sum_{k \le g} f_2(t+1, k) + f_2(t+1, g) \sum_{k \le g-1} f_1(t, k)\Big]. \tag{3}$$

As it will be required later, $\phi(t_1, t_2, g)$ will be used for the expression

$$f_1(t_1, g) \sum_{k \le g} f_2(t_2, k) + f_2(t_2, g) \sum_{k \le g-1} f_1(t_1, k), \quad \text{provided that } |t_1 - t_2| \le 1.$$

Thus $N(2t, g \mid r_1 r_2) = 2\phi(t, t, g)$ and $N(2t+1, g \mid r_1 r_2) = \phi(t+1, t, g) + \phi(t, t+1, g)$.

The enumeration of the required subset is completed by summing $N(2t, g \mid r_1 r_2)$ and $N(2t+1, g \mid r_1 r_2)$ over all groups, i.e. from $t = 1$ to $t = r_1 - g + 1$. If the number in this subset be denoted by $N(g \mid r_1 r_2)$, then

$$N(g \mid r_1 r_2) = \sum_{t=1}^{r_1-g+1} \big(2\phi(t, t, g) + \phi(t+1, t, g) + \phi(t, t+1, g)\big). \tag{4}$$


Hence in a sequence of $r$ elements, $r_1$ of which are $E$ and $r_2$ of which are $\bar{E}$, where $r_1 + r_2 = r$ and $r_1 \ge r_2$, the probability that the longest run consists of $g$ elements is

$$P\{g \mid r_1 r_2\} = \frac{N(g \mid r_1 r_2)}{{}^{r}C_{r_1}},$$

and the complete probability distribution of the length of the longest run, given $r_1$ and $r_2$, is obtained by letting $g$ take all possible values. Values of the function (4) for $r = 10$ to 15, and $r = 20$ are given in Table 1; values of the functions (2) and (3) for $r_1 = 14$, $r_2 = 6$, illustrating the form of the correlation between $T$, the number of groups, and $g$, are given in Table 2.
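The counting scheme of (1) to (4) can be sketched in Python and checked against direct enumeration for small cases. The function names (`bounded`, `f`, `phi`, `n_longest`, `n_longest_brute`) are labels introduced here for illustration, not notation from the paper; the sketch assumes both $r_1$ and $r_2$ are at least 1, and it lets the sum over $t$ run past $r_1 - g + 1$, relying on the extra terms being zero.

```python
from itertools import combinations, groupby
from math import comb


def bounded(n, t, s):
    """Compositions of n into t parts, no part exceeding s (result (1))."""
    if t < 1 or s < 1 or n < t:
        return 0
    total = 0
    for j in range(t + 1):
        top = n - j * s - 1
        if top < t - 1:
            break  # all remaining binomial coefficients vanish
        total += (-1) ** j * comb(t, j) * comb(top, t - 1)
    return total


def f(n, t, k):
    """Compositions of n into t parts whose greatest part is exactly k."""
    return bounded(n, t, k) - bounded(n, t, k - 1)


def phi(r1, r2, t1, t2, g):
    """phi(t1, t2, g): t1 groups of E, t2 groups of non-E, longest run exactly g."""
    return (f(r1, t1, g) * bounded(r2, t2, g)
            + f(r2, t2, g) * bounded(r1, t1, g - 1))


def n_longest(r1, r2, g):
    """N(g | r1 r2) following (4); terms beyond the stated limit are zero."""
    return sum(2 * phi(r1, r2, t, t, g)
               + phi(r1, r2, t + 1, t, g)
               + phi(r1, r2, t, t + 1, g)
               for t in range(1, max(r1, r2) + 1))


def n_longest_brute(r1, r2, g):
    """Direct enumeration of every arrangement, for checking small cases."""
    r = r1 + r2
    count = 0
    for pos in combinations(range(r), r1):
        chosen = set(pos)
        seq = [i in chosen for i in range(r)]
        if max(len(list(run)) for _, run in groupby(seq)) == g:
            count += 1
    return count


print(n_longest(10, 10, 10))  # 20 arrangements whose longest run is exactly 10
```

Dividing `n_longest(r1, r2, g)` by `comb(r1 + r2, r1)` gives the probability $P\{g \mid r_1 r_2\}$; the brute-force routine is practical only for short sequences and is included purely as a cross-check.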

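If, as is frequently the case in statistical applications, we require only the probability of the longest run having a length greater than or equal to a given value, say $g_0$, it will be considerably simpler to calculate $\sum_{g \ge g_0} \phi(t_1, t_2, g)$ directly; for using result (1) it follows that

$$\sum_{g \ge g_0} \phi(t_1, t_2, g) = {}^{r_1-1}C_{t_1-1}\,{}^{r_2-1}C_{t_2-1} - \prod_{i=1}^{2}\left(\sum_{j \ge 0} (-1)^j\,{}^{t_i}C_{j}\;{}^{r_i - j(g_0-1) - 1}C_{t_i-1}\right) \quad \text{for } |t_1 - t_2| \le 1,$$

$$= 0 \quad \text{for } |t_1 - t_2| > 1,$$

and for brevity we shall write $\phi(t_1, t_2, g \ge g_0)$ for this expression.

The probability that the length of the greatest run, $g$, is greater than or equal to $g_0$ is immediate, for

$$P\{g \ge g_0 \mid r_1 r_2\} = \sum_{g \ge g_0} P\{g \mid r_1 r_2\} = \frac{1}{{}^{r}C_{r_1}} \sum_{t=1}^{r_1-g_0+1} \big(2\phi(t, t, g \ge g_0) + \phi(t+1, t, g \ge g_0) + \phi(t, t+1, g \ge g_0)\big).$$

In the same way

$$P\{g \le g_0 \mid r_1 r_2\} = \frac{1}{{}^{r}C_{r_1}} \sum_{t} \big(2\phi(t, t, g \le g_0) + \phi(t+1, t, g \le g_0) + \phi(t, t+1, g \le g_0)\big),$$

where

$$\phi(t_1, t_2, g \le g_0) = \sum_{g \le g_0} \phi(t_1, t_2, g) = \prod_{i=1}^{2}\left(\sum_{j \ge 0} (-1)^j\,{}^{t_i}C_{j}\;{}^{r_i - jg_0 - 1}C_{t_i-1}\right) \quad \text{for } |t_1 - t_2| \le 1$$

and

$$= 0 \quad \text{for } |t_1 - t_2| > 1.$$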



Table 2. The joint distribution of $T$ and $g$ for the case $r_1 = 14$, $r_2 = 6$

($T$ = number of groups in the sequence)

 T \ g     14     13     12     11     10      9      8      7      6      5      4      3      2    Total
     2      2      .      .      .      .      .      .      .      .      .      .      .      .        2
     3      5      2      2      2      2      2      2      1      .      .      .      .      .       18
     4      .     20     20     20     20     20     20     10      .      .      .      .      .      130
     5      .     20     35     50     65     80     95    100     60     15      .      .      .      520
     6      .      .     60    120    180    240    300    360    240     60      .      .      .     1560
     7      .      .     30    100    210    360    550    780    900    610    100      .      .     3640
     8      .      .      .     80    240    480    800   1200   1560   1160    200      .      .     5720
     9      .      .      .     20    110    320    700   1300   2140   2590   1350     50      .     8580
    10      .      .      .      .     50    200    500   1000   1750   2300   1300     50      .     7150
    11      .      .      .      .      5     50    200    550   1225   2255   2410    455      .     7150
    12      .      .      .      .      .     12     60    180    420    810    912    180      .     2574
    13      .      .      .      .      .      .      7     42    147    392    735    392      1     1716
 Total      7     42    147    392    882   1764   3234   5523   8442  10192   7007   1127      1    38760






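It may be noted that if longest runs of the $E$'s only are considered then (2) and (3) reduce to $2f_1(t, g)\,{}^{r_2-1}C_{t-1}$ and $f_1(t+1, g)\,{}^{r_2-1}C_{t-1} + f_1(t, g)\,{}^{r_2-1}C_{t}$ respectively.

If $g_E$ be used to denote the length of the longest run of $E$'s in a sequence, then

$$N\{g_E = s \mid r_1 r_2\} = \sum_{t=1}^{r_1-s+1} f_1(t, s)\left[{}^{r_2-1}C_{t-2} + 2\,{}^{r_2-1}C_{t-1} + {}^{r_2-1}C_{t}\right] = \sum_{t=1}^{r_1-s+1} f_1(t, s)\,{}^{r_2+1}C_{t},$$

and

$$P\{g_E \ge s \mid r_1 r_2\} = \frac{1}{{}^{r}C_{r_1}} \sum_{g \ge s} \sum_{t} f_1(t, g)\,{}^{r_2+1}C_{t}.$$

Writing ${}^{r_2+1}C_{t}\,{}^{t}C_{j} = {}^{r_2+1}C_{j}\,{}^{r_2-j+1}C_{t-j}$, interchanging the order of summation and using the relation that $\sum_{i=0}^{m} {}^{m}C_{k+i}\,{}^{n}C_{i} = {}^{m+n}C_{k+n}$, we have that

$$P\{g_E \ge s \mid r_1 r_2\} = \frac{1}{{}^{r}C_{r_1}}\left[\sum_{j \ge 1} (-1)^{j+1}\,{}^{r_2+1}C_{j}\;{}^{r-js}C_{r_2}\right].$$

This result will be referred to in § 9.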

3. Conditional power function for the simple Markoff chain

It is clear that the length of the longest run may be used as a criterion for testing randomness in a sequence in the same way as the number of groups or runs. If a longest run of length $g_0$ be observed, then, as shown in § 2, the probability of obtaining a run of length greater than or equal to $g_0$, or of length less than or equal to $g_0$, can be found. If either of these probabilities is less than some arbitrarily assigned significance level, then we may reject $H_0$, the hypothesis of randomness. But the possible alternative hypotheses will be different if we judge on the upper tail (i.e. for $g$ large) or the lower tail (i.e. for $g$ small). When the observed $g_0$ is so large that the hypothesis $H_0$ is rejected, then a possible alternative hypothesis, $H_1$, might be that there is positive dependence of some kind in the sequence; that is to say the elements are not independent and given that $E$ has occurred it becomes more likely that it will occur again. On the other hand if the observed $g_0$ is small, then a possible alternative hypothesis might be that there is negative dependence in the sequence.

Consider a sequence of possible events $e_1, e_2, \ldots, e_r$ which may or may not be independent; it is well known that

$$P\{e_1 e_2 \ldots e_r\} = \prod_{i=1}^{r} P\{e_i \mid e_1 e_2 \ldots e_{i-1}\}.$$

If the events are independent then the relation reduces to

$$P\{e_1 e_2 \ldots e_r\} = \prod_{i=1}^{r} P\{e_i\}.$$

This is the basis of $H_0$, the hypothesis of randomness.

If there is dependence as in a simple Markoff chain, each event will be dependent only on the event immediately preceding it, and we shall have

$$P\{e_1 e_2 \ldots e_r\} = \prod_{i=1}^{r} P\{e_i \mid e_{i-1}\}.$$

This is the basis of $H_1$, the possible alternative hypothesis to $H_0$, and we shall take into account only the case where the dependence is positive.

Now consider a sequence of $r$ events composed of alternatives $E$ or $\bar{E}$, and let $E_i$ denote the happening of the event $E$ at the $i$th trial. $H_0$ and $H_1$ will be as above with capital letters replacing small ones. Under the binomial hypothesis, i.e. under $H_0$, we write

$$P\{E_i\} = p \quad \text{and} \quad P\{\bar{E}_i\} = q \quad \text{for } i = 1, 2, \ldots, r, \quad \text{where } p + q = 1,$$

and we shall have

$$P\{g \mid r_1 r_2 H_0\} = \frac{N(g \mid r_1 r_2)\,p^{r_1}q^{r_2}}{\sum_{g} N(g \mid r_1 r_2)\,p^{r_1}q^{r_2}} = \frac{N(g \mid r_1 r_2)}{{}^{r}C_{r_1}}.$$

The hypothesis, $H_0$, is rejected if $g \ge g_\alpha$ where

$$P\{g \ge g_\alpha \mid r_1 r_2 H_0\} = \alpha, \tag{5}$$

and $g_\alpha$ is chosen so that $\alpha$ is as near as possible to the chosen significance level. Such values of $g_\alpha$ are shown in Table 3 for $r = 10, 15, 20$, and $\alpha$ in the neighbourhood of 0.05. This table could clearly be extended if desired.
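As a cross-check on critical values of the kind defined by (5), the upper-tail probability can also be obtained without the composition formulae of § 2. The sketch below is not the paper's method: it swaps in a simple dynamic-programming count of arrangements in which no run exceeds a given length. The names `count_runs_at_most`, `tail_prob` and `nearest_critical` are hypothetical labels chosen for this illustration.

```python
from functools import lru_cache
from math import comb


def count_runs_at_most(r1, r2, m):
    """Arrangements of r1 E's and r2 non-E's in which every run has length <= m."""

    @lru_cache(maxsize=None)
    def ways(a, b, last, run):
        # a, b: symbols still to place; last: 0 for E, 1 for non-E, -1 at the start;
        # run: length of the run currently being built.
        if a == 0 and b == 0:
            return 1
        total = 0
        if a > 0:
            new_run = run + 1 if last == 0 else 1
            if new_run <= m:
                total += ways(a - 1, b, 0, new_run)
        if b > 0:
            new_run = run + 1 if last == 1 else 1
            if new_run <= m:
                total += ways(a, b - 1, 1, new_run)
        return total

    return ways(r1, r2, -1, 0)


def tail_prob(r1, r2, g0):
    """P{g >= g0 | r1 r2} under the hypothesis of randomness."""
    total = comb(r1 + r2, r1)
    return (total - count_runs_at_most(r1, r2, g0 - 1)) / total


def nearest_critical(r1, r2, level=0.05):
    """Value g0 whose tail probability lies nearest to the chosen level."""
    candidates = range(1, max(r1, r2) + 1)
    return min(candidates, key=lambda g0: abs(tail_prob(r1, r2, g0) - level))


print(nearest_critical(14, 6), round(tail_prob(14, 6, 10), 3))  # 10 0.038
```

For $r_1 = 14$, $r_2 = 6$ this agrees with the value $P\{g \ge 10 \mid H_0\} = 0.038$ quoted later in § 4.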

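As a likely alternative to $H_0$, we take $H_1$, and suppose that

$$P\{E_1\} = P, \quad P\{\bar{E}_1\} = Q, \quad \text{where } P + Q = 1,$$

$$P\{E_i \mid E_{i-1}\} = P_1, \quad P\{\bar{E}_i \mid E_{i-1}\} = Q_1,$$

$$P\{E_i \mid \bar{E}_{i-1}\} = P_2, \quad P\{\bar{E}_i \mid \bar{E}_{i-1}\} = Q_2, \quad \text{where } P_j + Q_j = 1 \text{ for } j = 1, 2.$$

The conditional power function $P\{g \mid r_1 r_2 H_1\}$ follows in a straightforward way by considering the partitions of $r_1$ and $r_2$ as was done to obtain the distribution of $g$ under $H_0$. For each partition within a given number of $2t$ or $2t + 1$ groups the multiplying probabilities are the same, for all that matters is the number of transitions from $E$ to $\bar{E}$ and back again. Thus for a given sequence of $2t$ groups beginning with $E$ there are $2t - 1$ transitions, $t$ from $E$ to $\bar{E}$ and $t - 1$ from $\bar{E}$ to $E$, and the remaining $r_1 - t$ and $r_2 - t$ are permanences of $E$ and $\bar{E}$ respectively. The probability of obtaining a given sequence of $2t$ groups beginning with $E$ is equal to

$$P\,P_1^{\,r_1-t}\,Q_1^{\,t}\,P_2^{\,t-1}\,Q_2^{\,r_2-t},$$

which may be written $\dfrac{P}{P_2}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t} P_1^{\,r_1} Q_2^{\,r_2}$, with $Q/Q_1$ in place of $P/P_2$ when the sequence begins with $\bar{E}$.

In the same way the probability of obtaining a given sequence of $t + 1$ groups of $E$ and $t$ of $\bar{E}$ is $\dfrac{P}{P_1}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t} P_1^{\,r_1} Q_2^{\,r_2}$, and of $t$ groups of $E$ and $t + 1$ of $\bar{E}$ is $\dfrac{Q}{Q_2}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t} P_1^{\,r_1} Q_2^{\,r_2}$.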

Table 3. Values of $g_\alpha$ such that $P\{g \ge g_\alpha \mid r_1 r_2\} = \alpha$, where $\alpha$ is as near as possible to 0.05

(The value of $\alpha$ is given in brackets)

 r_1 or r_2      r = 10         r = 15         r = 20
      2         8 (0.067)     13 (0.029)     17 (0.047)
      3         7 (0.033)     11 (0.035)     15 (0.035)
      4         6 (0.024)      9 (0.055)     12 (0.030)
      5         5 (0.040)      8 (0.042)     11 (0.049)
      6             -          7 (0.039)     10 (0.038)
      7             -          6 (0.053)      9 (0.034)
      8             -              -          8 (0.035)
      9             -              -          7 (0.040)
     10             -              -          7 (0.032)


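The joint probability distribution of $2t$ and $g$ is given by

$$P\{2t, g \mid r_1 r_2 H_1\} = \frac{\phi(t, t, g)\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right)\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}}{\displaystyle\sum_{t=1}^{r_2}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}\left[\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right){}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1} + \dfrac{P}{P_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \dfrac{Q}{Q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}\right]},$$

and similarly

$$P\{2t+1, g \mid r_1 r_2 H_1\} = \frac{\left[\phi(t+1, t, g)\dfrac{P}{P_1} + \phi(t, t+1, g)\dfrac{Q}{Q_2}\right]\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}}{\displaystyle\sum_{t=1}^{r_2}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}\left[\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right){}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1} + \dfrac{P}{P_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \dfrac{Q}{Q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}\right]}.$$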



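The probability distribution of $g$ is obtained by summing over all $t$ from $t = 1$ to $r_1 - g + 1$, that is to say

$$P\{g \mid r_1 r_2 H_1\} = \frac{\displaystyle\sum_{t=1}^{r_1-g+1}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}\left[\phi(t, t, g)\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right) + \phi(t+1, t, g)\dfrac{P}{P_1} + \phi(t, t+1, g)\dfrac{Q}{Q_2}\right]}{\displaystyle\sum_{t=1}^{r_2}\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}\left[\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right){}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1} + \dfrac{P}{P_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \dfrac{Q}{Q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}\right]}. \tag{6}$$

For the power of the test we require $P\{g \ge g_\alpha \mid r_1 r_2 H_1\}$, where $g_\alpha$ is given by (5). This probability is obtained by summing the left-hand side of (6) from $g = g_\alpha$ to $r_1$, and in practice it is simpler to sum first with respect to $g$ and then with respect to $t$ using the expression for $\sum_{g \ge g_\alpha} \phi(t_1, t_2, g)$ given in § 2.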
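For short sequences the conditional power can be checked by brute force, weighting every arrangement by its probability under the simple Markoff chain and conditioning on the observed numbers $r_1$ and $r_2$. The sketch below is an illustration under that assumption, not the summation over $t$ described above; `longest_run`, `sequence_prob` and `conditional_power` are hypothetical names. The transition probabilities must lie strictly between 0 and 1 so that the normalising sum is not zero.

```python
from itertools import combinations, groupby


def longest_run(seq):
    """Length of the longest run of identical symbols in seq."""
    return max(len(list(run)) for _, run in groupby(seq))


def sequence_prob(seq, P, P1, P2):
    """Probability of one sequence (True = E) under the simple Markoff chain."""
    prob = P if seq[0] else 1.0 - P
    for prev, cur in zip(seq, seq[1:]):
        p_e = P1 if prev else P2        # chance of E given the preceding result
        prob *= p_e if cur else 1.0 - p_e
    return prob


def conditional_power(r1, r2, g_alpha, P, P1, P2):
    """P{g >= g_alpha | r1 r2 H1} by enumeration of all arrangements."""
    r = r1 + r2
    hit = total = 0.0
    for pos in combinations(range(r), r1):
        chosen = set(pos)
        seq = tuple(i in chosen for i in range(r))
        weight = sequence_prob(seq, P, P1, P2)
        total += weight
        if longest_run(seq) >= g_alpha:
            hit += weight
    return hit / total


# With P1 = P2 = P the chain is independent and the null value is recovered.
print(round(conditional_power(5, 5, 5, 0.5, 0.5, 0.5), 3))  # 0.04
```

Setting $P_1 > P_2$ (positive dependence) raises the returned value above the null level, which is the behaviour the power function is meant to capture.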


4. Comparison of certain power curves for the criteria, length of longest run and number of groups

Throughout the computation of the power curves it has been assumed that

$$P = PP_1 + QP_2. \tag{7}$$

This condition is arrived at by using the relation $P\{E_i\} = P\{E_{i-1}E_i\} + P\{\bar{E}_{i-1}E_i\}$ on the assumption that $P\{E_i\} = P$ and $P\{\bar{E}_i\} = Q$ for all $i$; that is to say we are assuming that the probability of the event $E$ occurring at the $i$th trial when nothing is known about the results of the preceding trials is independent of $i$. This in effect implies that the start of the sequence of observations is a randomly selected point in a longer sequence following the same law.

The power function for either criterion is given as a function of the three parameters $P$, $P_1$ and $P_2$, but, on application of (7), reduces to a function of two parameters. If we take these as $P_1$ and $P_2$, we note that $P$ is a constant, $k$, on straight lines whose equations are $kP_1 + (1 - k)P_2 = k$. The power functions for the number of groups criterion, as plotted by David for $P = 0.5$, 0.6 and 0.75, are then sections of the power surface cutting the $(P_1, P_2)$ plane in the straight lines (1), (2) and (3) shown in Fig. 1. As the alternative hypothesis under consideration is $P_1 > P_2$, the curves have been taken as starting at the central diagonal $P_1 = P_2 = P$, that is to say we are not interested in the power for $P_1 < P_2$. Fig. 1 shows contours of the power surface for the number of groups criterion when $r_1 = r_2 = 10$ and the significance level is 0.05. The probability of establishing a difference, using this significance level, when $P_1 > P_2$ is about 0.22, 0.56 and 0.89 respectively on the three contours shown.

Alternatively we could express the power function in terms of $P$ and $P_1 - P_2$. If, following Markoff (1913, p. 45), we take $P_1 - P_2 = \delta$, then $|\delta|$ is a measure of the degree of dependence in the sequence, and the sign of $\delta$ indicates the direction of dependence ($+$ for positive dependence, $-$ for negative dependence). When $\delta = 0$, the observations are independent. Using (7) it follows that

$$P_1 = P + \delta Q, \quad Q_1 = Q(1 - \delta), \quad P_2 = P(1 - \delta), \quad Q_2 = Q + \delta P,$$

and the power function for either criterion may be expressed in terms of $P$ and $\delta$. The power curves for the two criteria when the alternative hypothesis is $\delta > 0$ are shown in Fig. 2 for $r_1 = r_2 = 10$; $r_1 = 14$, $r_2 = 6$ and $P = 0.5$, 0.6, 0.75.* The chance, $\alpha$, of rejecting $H_0$ when it is true is different for the two criteria; this is inevitable owing to the discontinuity of the distributions. For example, when $r_1 = 14$, $r_2 = 6$, $P\{g \ge 10 \mid H_0\} = 0.038$ (see Table 3), while $P\{T \le 6 \mid H_0\} = 0.058$. These are the values of $\alpha$ nearest to the 0.05 level in each case. But, allowing for this difference, it is seen on comparing similar curves for the two criteria that $T$, the number of groups criterion, is the more powerful in detecting departures from randomness of the single dependence kind. This might perhaps have been expected on intuitive grounds.

* When $P = 0.5$ and $P = 0.6$ the power curves are so close as to be indistinguishable on the graphs. This can be seen, too, from the contours in Fig. 1, for the contours are very nearly parallel to the diagonal $\delta = P_1 - P_2 = 0$, for this range of $P$.
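The reparametrisation in terms of $P$ and $\delta$ is easy to verify numerically. The following sketch, in which the function name `markoff_from_delta` and the dictionary keys are assumptions made for illustration, returns the four transition probabilities and confirms that they satisfy condition (7) and $P_1 - P_2 = \delta$.

```python
def markoff_from_delta(P, delta):
    """Transition probabilities of the simple chain expressed through P and delta."""
    Q = 1.0 - P
    return {
        "P1": P + delta * Q,     # P{E | E at the preceding trial}
        "Q1": Q * (1.0 - delta),
        "P2": P * (1.0 - delta), # P{E | not-E at the preceding trial}
        "Q2": Q + delta * P,
    }


params = markoff_from_delta(P=0.6, delta=0.3)
# Condition (7): P = P*P1 + Q*P2 should return the value of P we started from.
print(round(0.6 * params["P1"] + 0.4 * params["P2"], 10))  # 0.6
```

Because the identity holds for every $\delta$, fixing $P$ and varying $\delta$ traces out one of the straight-line sections of the power surface described in the text.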



Fig. 1. Contours of the power surface for the number of groups criterion, $T$ ($r_1 = r_2 = 10$). The 5% significance level has been used.

Fig. 2. Comparison of power curves for $T$ and $g$. Case where the alternative hypothesis is positive dependence of the simple kind. Curves plotted: $P\{T \le T_\alpha \mid r_1 r_2 H_1\}$ and $P\{g \ge g_\alpha \mid r_1 r_2 H_1\}$. For curves (i) $P = 0.5$ or 0.6 (see footnote to § 4). For curves (ii) $P = 0.75$.





It can further be noted that for either criterion the power is greater when $r_1 = r_2$, though
the difference is more marked for the length of the longest run. In quality control work it is
usual to consider runs above and below the median rather than above and below the mean.
This ensures that $r_1 = r_2$ and gives increased power to the test.

5. The distribution of the number of groups when the hypothesis is that of double dependence in the chain

In § 4 it has been shown that T is a more powerful criterion than g for detecting departures 
from randomness when the alternative hypothesis is that of single dependence. It would be 
interesting to compare the powers of the tests when the alternative hypothesis is dependence 
of a different kind. Unfortunately the formulae become increasingly complex as the number 
of parameters defining the sequence is increased. We shall, therefore, deal only with the case 
of what could be called double dependence or dependence of the second order. 

It is assumed in this case that each event is dependent only on the two events immediately preceding it. The general formula $P\{e_1 e_2 \ldots e_r\} = \prod_{i=1}^{r} P\{e_i \mid e_1 e_2 \ldots e_{i-1}\}$ then reduces to $P\{e_1 e_2 \ldots e_r\} = \prod_{i=1}^{r} P\{e_i \mid e_{i-2}e_{i-1}\}$. This is the basis of hypothesis $H_2$, and it is further assumed under $H_2$ that

(i) when nothing is known about the results of the $(i - 2)$th and $(i - 1)$th trials,

$$P\{E_i\} = P \quad \text{and} \quad P\{\bar{E}_i\} = Q, \quad \text{where } P + Q = 1;$$

(ii) when the result of the $(i - 1)$th trial is known, but not the result of the $(i - 2)$th trial,

$$P\{E_i \mid E_{i-1}\} = P_1, \quad P\{\bar{E}_i \mid E_{i-1}\} = Q_1,$$

$$P\{E_i \mid \bar{E}_{i-1}\} = P_2, \quad P\{\bar{E}_i \mid \bar{E}_{i-1}\} = Q_2, \quad \text{where } P_j + Q_j = 1 \text{ for } j = 1, 2;$$

(iii) when the results of the $(i - 2)$th and $(i - 1)$th trials are known,

$$P\{E_i \mid E_{i-2}E_{i-1}\} = p_1, \quad P\{\bar{E}_i \mid E_{i-2}E_{i-1}\} = q_1,$$

$$P\{E_i \mid E_{i-2}\bar{E}_{i-1}\} = p_2, \quad P\{\bar{E}_i \mid E_{i-2}\bar{E}_{i-1}\} = q_2,$$

$$P\{E_i \mid \bar{E}_{i-2}E_{i-1}\} = p_3, \quad P\{\bar{E}_i \mid \bar{E}_{i-2}E_{i-1}\} = q_3,$$

$$P\{E_i \mid \bar{E}_{i-2}\bar{E}_{i-1}\} = p_4, \quad P\{\bar{E}_i \mid \bar{E}_{i-2}\bar{E}_{i-1}\} = q_4, \quad \text{where } p_j + q_j = 1 \text{ for } j = 1, 2, 3, 4.$$

Using the relations

$$P\{E_i\} = P\{E_{i-1}E_i\} + P\{\bar{E}_{i-1}E_i\},$$

$$P\{E_{i-1}E_i\} = P\{E_{i-2}E_{i-1}E_i\} + P\{\bar{E}_{i-2}E_{i-1}E_i\}$$

and

$$P\{\bar{E}_{i-1}E_i\} = P\{E_{i-2}\bar{E}_{i-1}E_i\} + P\{\bar{E}_{i-2}\bar{E}_{i-1}E_i\},$$

we obtain condition (7), i.e. $P = PP_1 + QP_2$, together with

$$P_1p_1 + Q_1p_3 = P_1, \tag{8}$$

and

$$P_2p_2 + Q_2p_4 = P_2. \tag{9}$$

These conditions again ensure that the start of the sequence is a point chosen at random in a longer sequence, and can be written in the form

$$P_1 = \frac{p_3}{1 - p_1 + p_3}, \quad P_2 = \frac{p_4}{1 - p_2 + p_4}, \quad P = \frac{P_2}{1 - P_1 + P_2} = \frac{p_4(1 - p_1 + p_3)}{p_4(1 - p_1 + p_3) + (1 - p_1)(1 - p_2 + p_4)}.$$

Thus the problem is reduced to one with four parameters $p_1$, $p_2$, $p_3$, $p_4$.
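The reduction to four parameters can be illustrated numerically. In the sketch below, `marginal_probs` evaluates the closed forms obtained from (7), (8) and (9), while `stationary_pairs` is an independent check that iterates the second-order chain on pairs of consecutive results until it settles; both function names, and the fixed number of sweeps, are assumptions made for this example.

```python
def marginal_probs(p1, p2, p3, p4):
    """Return (P, P1, P2) implied by conditions (7), (8) and (9)."""
    P1 = p3 / (1.0 - p1 + p3)
    P2 = p4 / (1.0 - p2 + p4)
    P = P2 / (1.0 - P1 + P2)
    return P, P1, P2


def stationary_pairs(p1, p2, p3, p4, sweeps=2000):
    """Long-run probabilities of the pairs (result at i-2, result at i-1); 1 = E."""
    # cond[(a, b)] is the chance of E given the two preceding results a, b.
    cond = {(1, 1): p1, (1, 0): p2, (0, 1): p3, (0, 0): p4}
    pairs = {key: 0.25 for key in cond}
    for _ in range(sweeps):
        nxt = {key: 0.0 for key in cond}
        for (a, b), mass in pairs.items():
            nxt[(b, 1)] += mass * cond[(a, b)]
            nxt[(b, 0)] += mass * (1.0 - cond[(a, b)])
        pairs = nxt
    return pairs


P, P1, P2 = marginal_probs(0.7, 0.5, 0.5, 0.3)
print(round(P, 3), round(P1, 3), round(P2, 3))  # 0.5 0.625 0.375
```

In this symmetric case $P_1 = 1 - P_2$, so that $P_1 = Q_2$ and $P = 0.5$.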





To construct the power function under $H_2$ it is necessary to consider not only the sequences of $2t$ or $2t + 1$ groups, but also the subset of such sequences containing a specified number of single $E$'s and of single $\bar{E}$'s. If the $r_1$ $E$'s are partitioned into $t$ parts of which $m_1$ parts consist of a single $E$, then there are $t - m_1$ parts consisting of two or more $E$'s, and the number of times that $E$ follows $EE$ is $r_1 - m_1 - 2(t - m_1)$, i.e. $r_1 - 2t + m_1$. Similar results hold if the $r_2$ $\bar{E}$'s are partitioned into $t$ parts of which $m_2$ parts consist of single $\bar{E}$'s. Suppose this sequence of $2t$ groups starts with at least two consecutive $E$'s and ends with at least two consecutive $\bar{E}$'s. Then it can easily be seen that the probability of obtaining such a sequence is

$$\frac{PP_1}{p_3p_4}\, p_1^{\,r_1-2t+m_1}\, q_4^{\,r_2-2t+m_2}\, (p_3q_1)^{t-m_1}\, (p_4q_2)^{t-m_2}\, q_3^{\,m_1}\, p_2^{\,m_2},$$

which we may write for brevity $\dfrac{PP_1}{p_3p_4}\,z$, where

$$z = p_1^{\,r_1} q_4^{\,r_2} \left(\frac{p_1q_3}{p_3q_1}\right)^{m_1} \left(\frac{p_2q_4}{p_4q_2}\right)^{m_2} \left(\frac{p_3p_4q_1q_2}{p_1^2q_4^2}\right)^{t},$$

and refer to as sequence (i). The results for sequences (ii) starting $EE$, ending $E\bar{E}$, (iii) starting $E\bar{E}$, ending $\bar{E}\bar{E}$, (iv) starting $E\bar{E}$, ending $E\bar{E}$ are $\dfrac{PP_1}{p_2p_3}z$, $\dfrac{PQ_1}{p_4q_3}z$, $\dfrac{PQ_1}{p_2q_3}z$ respectively. Four corresponding results are obtained when the $E$'s and $\bar{E}$'s in (i), (ii), (iii) and (iv) are interchanged.

For a sequence of $2t + 1$ groups the subdivisions are into (a) $t + 1$ groups of $r_1$ $E$'s containing $m_1$ single $E$'s and $t$ groups of $r_2$ $\bar{E}$'s containing $m_2$ single $\bar{E}$'s, and (b) $t$ groups of $r_1$ $E$'s containing $m_1$ single $E$'s and $t + 1$ groups of $r_2$ $\bar{E}$'s containing $m_2$ single $\bar{E}$'s. Under (a) the sequences considered are those (i) starting $EE$, ending $EE$, (ii) starting $EE$, ending $\bar{E}E$, (iii) starting $E\bar{E}$, ending $EE$, (iv) starting $E\bar{E}$, ending $\bar{E}E$; and the probabilities of obtaining these sequences are $\dfrac{PP_1}{p_1^2}z$, $\dfrac{PP_1q_1}{p_1^2q_3}z$, $\dfrac{PQ_1p_3}{p_1^2q_3}z$, $\dfrac{PQ_1p_3q_1}{p_1^2q_3^2}z$ respectively. Under (b) four corresponding subdivisions are considered.

Finally, the number of ways in which each such sequence can occur is required. The number of compositions of $(r_1 - m_1)$ $E$'s into $(t - m_1)$ parts when no single $E$ occurs is ${}^{r_1-t-1}C_{t-m_1-1}$ for $m_1 = 0, 1, \ldots, t - 1$, and is unity for $m_1 = t = r_1$. This is the coefficient of $x^{r_1-m_1}$ in $(x^2 + x^3 + \ldots)^{t-m_1}$. Since the $m_1$ single $E$'s can occupy the $t$ spaces in ${}^{t}C_{m_1}$ ways, it follows that the number of compositions of $r_1$ $E$'s into $t$ parts, $m_1$ of which are single $E$'s, is ${}^{r_1-t-1}C_{t-m_1-1}\cdot{}^{t}C_{m_1}$. If this be denoted by $h_1(t, m_1)$, then the number of sequences of $2t$ groups containing $m_1$ single $E$'s and $m_2$ single $\bar{E}$'s is $2h_1(t, m_1)h_2(t, m_2)$ and the numbers of sequences of $2t$ groups specified by (i), (ii), (iii), (iv) are given by the product of $h_1(t, m_1)h_2(t, m_2)$ with

$$\frac{(t - m_1)(t - m_2)}{t^2}, \quad \frac{(t - m_1)m_2}{t^2}, \quad \frac{m_1(t - m_2)}{t^2}, \quad \frac{m_1m_2}{t^2}$$

respectively. Similar results hold in the other cases.
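The count of compositions with a prescribed number of single parts can be checked by direct enumeration. In the sketch below the function `h` evaluates the binomial expression just given, and `compositions` lists every composition so that the two can be compared; both names are introduced here purely for illustration.

```python
from math import comb


def h(r, t, m):
    """Compositions of r into t parts, exactly m of which are equal to 1."""
    if m == t:
        return 1 if r == t else 0       # every part single: possible only if r = t
    if not 0 <= m < t or r - t - 1 < 0:
        return 0
    return comb(r - t - 1, t - m - 1) * comb(t, m)


def compositions(n, t):
    """Yield every ordered way of writing n as a sum of t positive integers."""
    if t == 1:
        if n >= 1:
            yield (n,)
        return
    for first in range(1, n - t + 2):
        for rest in compositions(n - first, t - 1):
            yield (first,) + rest


# Compare the formula with enumeration for one small case.
brute = sum(1 for parts in compositions(9, 3) if parts.count(1) == 1)
print(h(9, 3, 1), brute)  # 15 15
```

Summing `h(r, t, m)` over `m` recovers the total number of compositions of `r` into `t` parts, `comb(r - 1, t - 1)`, which gives a second consistency check.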

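Combining the results and summing for all $m_1$ and $m_2$, we obtain

$$P\{2t \mid r_1 r_2 H_2\} = \frac{A}{C} \quad \text{and} \quad P\{2t+1 \mid r_1 r_2 H_2\} = \frac{B}{C} \quad \text{for } t = 1, 2, \ldots, r_2 \text{ and } r_1 \ge r_2,$$

where

$$A = \sum_{m_1 \ge 0}\sum_{m_2 \ge 0} \left(\frac{p_1q_3}{p_3q_1}\right)^{m_1} \left(\frac{p_2q_4}{p_4q_2}\right)^{m_2} \left(\frac{p_3p_4q_1q_2}{p_1^2q_4^2}\right)^{t} \frac{h_1(t, m_1)\,h_2(t, m_2)}{t^2} \left[(t - m_1)(t - m_2)\left(\frac{PP_1}{p_3p_4} + \frac{QQ_2}{q_1q_2}\right) + (t - m_1)m_2\left(\frac{PP_1}{p_2p_3} + \frac{QP_2}{p_2q_1}\right) + m_1(t - m_2)\left(\frac{QQ_2}{q_2q_3} + \frac{PQ_1}{p_4q_3}\right) + m_1m_2\,\frac{PQ_1 + QP_2}{p_2q_3}\right],$$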



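$$B = \sum_{m_1 \ge 0}\sum_{m_2 \ge 0} \left(\frac{p_1q_3}{p_3q_1}\right)^{m_1} \left(\frac{p_2q_4}{p_4q_2}\right)^{m_2} \left(\frac{p_3p_4q_1q_2}{p_1^2q_4^2}\right)^{t} \left\{\frac{h_1(t+1, m_1)\,h_2(t, m_2)}{t(t+1)}\,\frac{P}{p_1^2}\left[(t - m_1 + 1)(t - m_1)P_1 + m_1(t - m_1 + 1)\frac{P_1q_1 + Q_1p_3}{q_3} + m_1(m_1 - 1)\frac{Q_1p_3q_1}{q_3^2}\right] + \frac{h_1(t, m_1)\,h_2(t+1, m_2)}{t(t+1)}\,\frac{Q}{q_4^2}\left[(t - m_2 + 1)(t - m_2)Q_2 + m_2(t - m_2 + 1)\frac{Q_2p_4 + P_2q_2}{p_2} + m_2(m_2 - 1)\frac{P_2q_2p_4}{p_2^2}\right]\right\},$$

$$C = \sum_{t=1}^{r_2} (A + B),$$

and where

$$h_i(t, m_i) = {}^{r_i-t-1}C_{t-m_i-1}\;{}^{t}C_{m_i} \quad \text{for } m_i < t,$$

$$= 1 \quad \text{for } m_i = t, \quad i = 1, 2.$$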

Using conditions (7), (8), (9) these expressions simplify to some extent, and we may note:

(1) If $p_1 = p_2 = p_3 = p_4$, then $P = P_1 = P_2$ and $P\{T \mid r_1 r_2 H_2\}$ becomes $P\{T \mid r_1 r_2 H_0\}$, for $A$ reduces to $2\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1}$ and $B$ to ${}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + {}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}$.

(2) If $p_1 = p_3$ and $p_2 = p_4$, then $P_1 = p_1$ and $P_2 = p_2$ and $P\{T \mid r_1 r_2 H_2\}$ becomes $P\{T \mid r_1 r_2 H_1\}$, for $A$ reduces to $\left(\dfrac{P}{P_2} + \dfrac{Q}{Q_1}\right)\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t-1}$, and $B$ reduces to $\left(\dfrac{P_2Q_1}{Q_2P_1}\right)^{t}\left[\dfrac{P}{P_1}\,{}^{r_1-1}C_{t}\,{}^{r_2-1}C_{t-1} + \dfrac{Q}{Q_2}\,{}^{r_1-1}C_{t-1}\,{}^{r_2-1}C_{t}\right]$.

(3) If $p_1 = p_2$ and $p_3 = p_4$, then $P = P_1 = P_2$ and we have what might be called 'throw-back' dependence, for the result of the $i$th trial depends only on the result of the $(i - 2)$th trial, and in fact the odd and even events in the sequence form independent sequences. The group test will be of little use unless we are interested in both positive and negative dependence as the alternative hypothesis.

(4) If $p_2 = p_3$, then the dependence is of the 'global' type, for each event depends only on the number of 'successes' in the preceding two trials.


6. The distribution of the length of the longest run under hypothesis $H_2$

This distribution can be obtained in a manner similar to the foregoing, but the formula is even more unwieldy. We shall consider, therefore, only $P\{g \ge s \mid r_1 r_2 H_2\}$. Suppose $h_i(t, m_i, g < s)$, where $i = 1, 2$, is the number of compositions of $r_i$ elements into $t$ parts of which $m_i$ parts contain one and only one element and such that no part contains $s$ or more than $s$ elements. Then

$$h_i(t, m_i, g < s) = {}^{t}C_{m_i}\left[\sum_{j \ge 0} (-1)^j\,{}^{t-m_i}C_{j}\;{}^{r_i - t - j(s-2) - 1}C_{t-m_i-1}\right] \quad \text{for } s = 3, 4, \ldots, r_i + 1,$$

$$= 1 \quad \text{for } s = 2,$$

where the expression in the square brackets is the coefficient of $x^{r_i-m_i}$ in the expansion of $(x^2 + x^3 + \ldots + x^{s-1})^{t-m_i}$. Clearly $h_i(t, m_i, g < r_i + 1) = h_i(t, m_i)$. The number of sequences of $2t$ groups containing $m_1$ single $E$'s and $m_2$ single $\bar{E}$'s and such that the longest run of either $E$'s or $\bar{E}$'s has length greater than or equal to $s$ is

$$2\big[h_1(t, m_1)\,h_2(t, m_2) - h_1(t, m_1, g < s)\,h_2(t, m_2, g < s)\big].$$

If we denote by $A'$ the expression obtained on replacing $h_1(t, m_1)\,h_2(t, m_2)$ in $A$ (see § 5) by $h_1(t, m_1)\,h_2(t, m_2) - h_1(t, m_1, g < s)\,h_2(t, m_2, g < s)$, and if we obtain $B'$ from $B$ in a similar way, then

$$P\{g \ge s \mid r_1 r_2 H_2\} = \frac{\displaystyle\sum_{t=1}^{r_1-s+1} (A' + B')}{\displaystyle\sum_{t=1}^{r_2} (A + B)}.$$

7. Comparison of certain power curves for the criteria T and g

WHEN THE ALTERNATIVE HYPOTHESIS IS DOUBLE DEPENDENCE 

In plotting the power curves, only the cases where

$$p_1 + p_4 = 1 = p_2 + p_3 \qquad (10)$$

have been considered. Using the relations previously obtained, namely

$$P_1 = \frac{p_3}{1 - p_1 + p_3}, \qquad P_2 = \frac{p_4}{1 - p_2 + p_4}, \qquad P = \frac{P_2}{1 - P_1 + P_2},$$

it follows from (10) that $P_1 = Q_2$, $P_2 = Q_1$ and $P = 0.5$. This seems a reasonable case to take for it merely implies symmetry with respect to $E$ and $\bar{E}$. Writing the relations out in full we have $P = 0.5$, $P_1 = Q_2$, $P_2 = Q_1$, $p_4 = q_1$, $p_1 = q_4$, $p_2 = q_3$ and $p_3 = q_2$. The power functions can now be expressed in terms of two parameters, say $p_1$ and $p_3$. Three sections of this power surface have been considered for $r_1 = r_2 = 10$, namely sections by the planes (a) $p_1 = p_3$ ($H_1$ in Fig. 2), (b) $p_3 = 0.5$ ($H_2$ in Fig. 3), (c) $4p_3 - 2p_1 = 1$ ($H_2$ in Fig. 3).
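The consequences of condition (10) can be checked numerically. The sketch below is illustrative only: it assumes the relations $P_1 = p_3/(1 - p_1 + p_3)$, $P_2 = p_4/(1 - p_2 + p_4)$ and $P = P_2/(1 - P_1 + P_2)$, and the function names are hypothetical, not taken from the paper.

```python
def first_order_probabilities(p1, p2, p3, p4):
    """Return (P1, P2, P) implied by the four double-dependence probabilities."""
    P1 = p3 / (1 - p1 + p3)
    P2 = p4 / (1 - p2 + p4)
    P = P2 / (1 - P1 + P2)
    return P1, P2, P


def symmetric_case(p1, p3):
    """Impose condition (10): p1 + p4 = 1 = p2 + p3."""
    return first_order_probabilities(p1, 1 - p3, p3, 1 - p1)


if __name__ == "__main__":
    # A point on each kind of section; the third is single dependence.
    for p1, p3 in [(0.7, 0.5), (0.6, 0.55), (0.8, 0.8)]:
        P1, P2, P = symmetric_case(p1, p3)
        print(f"p1={p1} p3={p3}: P1={P1:.4f} P2={P2:.4f} P={P:.4f}")
```

Under these assumptions every point satisfying (10) should give $P = 0.5$ and $P_1 + P_2 = 1$, that is $P_1 = Q_2$.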


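[Fig. 3: two panels for $r_1 = r_2 = 10$, one for the alternative hypothesis $H_2$ ($p_3 = 0.5$) and one for the alternative hypothesis $H_2$ ($4p_3 - 2p_1 = 1$), each plotted against a scale of $p_1$.]

Fig. 3. Case where the alternative hypothesis is double dependence: power functions of the $T$ criterion and of the $g$ criterion, given $r_1$, $r_2$ and $H_2$.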

(a) When $p_1 = p_3$, as it does along the central diagonal in the $(p_1, p_3)$ plane, the dependence in the sequence is of the single kind, and when $p_1 = p_3 = 0.5$ then there is independence.

(b) When $p_3 = 0.5$, then the dependence is of the global kind for the probability of obtaining a success at any trial depends on whether there were 2, 1, 0 successes in the preceding two trials; and in the case taken these probabilities decrease in arithmetic progression, e.g. when $p_1 = 0.7$, $p_2 = p_3 = 0.5$, $p_4 = 0.3$.







Power function of the longest-run test 

(c) When $4p_3 - 2p_1 = 1$, the dependence is intermediate between these two types. Typical values of $p_1$, $p_2$, $p_3$, $p_4$ are shown in the following table:

    p_1    0.5    0.6     0.7    0.8     0.9
    p_3    0.5    0.55    0.6    0.65    0.7
    p_2    0.5    0.45    0.4    0.35    0.3
    p_4    0.5    0.4     0.3    0.2     0.1

It will be seen that, apart from the first column in which the events are independent, the probability of an event $E$ occurring decreases according as the results of the two preceding trials are $EE$, $\bar{E}E$, $E\bar{E}$ or $\bar{E}\bar{E}$.

When the alternative hypothesis is single dependence, then, as we have noted previously, 
the g criterion is considerably less powerful than the T criterion in detecting departures from 
randomness. This point is illustrated by the power curves of Fig. 2. The power curves of 
Fig. 3 show that this is not necessarily the case when the alternative hypothesis is double 
dependence, for in one of the two cases considered the g criterion would appear to be no less 
powerful than T in detecting departures from the basic hypothesis. 

It is realized that this investigation is not yet complete, but further work is being done 
on types of dependence likely to occur in practice, and on the possibility of using an estimate 
of the degree of dependence in a sequence as a test criterion. 

8. Illustration of dependence in a sequence 

As an illustration of sequences which might show dependence, two passages have been taken at random, the first from a standard text-book on Statistics and the second from Gertrude Stein’s writings. The event, $E$, recorded is the occurrence of a word of one syllable, while $\bar{E}$ is the occurrence of a word of more than one syllable. To eliminate possible end effects the first three and the last three words in each sentence have not been counted. The results for the individual sentences have been pooled to obtain the frequency tables below. The total number of words counted was 600 in each case.

[Frequency tables for the first sample and the second sample, each classified by the present word.]

Estimates of the probabilities, relative probabilities and degree of dependence ($\theta$) are shown in the following table:

                  P      P_1     P_2     p_1     p_2     p_3     p_4     θ
    1st sample    0.66   0.59    0.76    0.56    0.70    0.64    0.78    -0.17
    2nd sample    0.78   0.77    0.81    0.76    0.78    0.80    0.91    -0.04

(All column entries are sample estimates.)


If we take as our hypothesis that $P_1 = P_2 = P$, then, applying the $\chi^2$-test for independence to the 2 × 2 tables above, we obtain values for $\chi^2$ of 13.62 for the first sample and 0.49 for the second sample. On this evidence we reject the hypothesis of independence in the first case, but not in the second case. The first sample does, in fact, seem to exhibit a degree of negative dependence of the single kind, i.e. given a monosyllabic (or polysyllabic) word the chance of its being followed by another monosyllabic (or polysyllabic) word is less than it would be if the events were independent. It is not clear whether or not the sequence shows double dependence. Applying the $\chi^2$-test to the 2 × 4 table we again get a significant value for $\chi^2$, which leads us to reject the hypothesis $p_1 = p_2 = p_3 = p_4 = P$; but it might still be the case that $p_1 = p_3 = P_1$ and $p_2 = p_4 = P_2$, when the double dependence would reduce to single dependence.

In the second sample it might well be that $P = P_1 = P_2 = p_1 = p_2 = p_3 = p_4$, the fluctuations in the estimates being due to chance, and the $\chi^2$-test applied to the 2 × 2 and to the 2 × 4 table confirms this. Thus in Gertrude Stein’s work monosyllabic and polysyllabic words would seem to occur at random.

9. Application of the distribution of the longest run

TO THE CLASSICAL PROBLEM OF ‘RUNS OF LUCK’

The problem of ‘runs of luck’ was first put forward and solved by de Moivre. It is concerned with finding the probability that an event $E$ occurs at least $s$ times in succession in a series of $r$ independent trials, when the probability that the event $E$ occurs is constant and equal to $p$. The problem has been solved using difference equations. The alternative solution given below is based on the distribution of the longest run.

It has been shown in § 2 that, if $g_E$ is the length of the longest run of $E$’s, then

$$P\{g_E \ge s \mid r_1 r_2\} = \sum_{i=1}^{[r_1/s]} (-1)^{i+1}\, {}^{r_2+1}C_i\; {}^{r-is}C_{r_2} \Big/ {}^{r}C_{r_1}, \quad \text{where } r_1 + r_2 = r.$$

If $r$ (the number of independent trials) is given and $P\{E\}$ is constant and equal to $p$, then

$$P\{r_1\} = {}^{r}C_{r_1}\, p^{r_1} q^{r_2}$$

and

$$P\{g_E \ge s\} = \sum_{r_1 = s}^{r} P\{g_E \ge s \mid r_1\} \times P\{r_1\} = \sum_{r_1 \ge s} \sum_{i} (-1)^{i+1}\, {}^{r_2+1}C_i\; {}^{r-is}C_{r_2}\, p^{r_1} q^{r_2}.$$





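Using the relation $\sum_{k=0}^{n} {}^{n}C_k\, p^k q^{n-k} = 1$, we obtain

$$P\{g_E \ge s\} = \sum_{i} (-1)^{i+1} p^{is} q^{i-1}\, {}^{r-is}C_{i-1} \left(1 + \frac{q\{r - i(s+1) + 1\}}{i}\right) = \sum_{i} (-1)^{i+1} p^{is} q^{i-1}\, {}^{r-is}C_{i-1} \left\{p + \frac{q(r - is + 1)}{i}\right\}.$$

This can readily be shown to be identical with the solution using difference equations, as given by Uspensky (1937, p. 77).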
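The closed form can be checked against direct enumeration for small $r$. The sketch below assumes the result reads $P\{g_E \ge s\} = \sum_i (-1)^{i+1} p^{is} q^{i-1}\, {}^{r-is}C_{i-1}\{p + q(r - is + 1)/i\}$ with $q = 1 - p$; the function names are hypothetical, and the brute-force routine is only practical for short sequences.

```python
from itertools import product
from math import comb


def prob_run_at_least(r, s, p):
    """Closed-form probability of a run of at least s successes in r trials."""
    q = 1.0 - p
    total = 0.0
    for i in range(1, r // s + 1):
        n = r - i * s
        if i - 1 > n:  # binomial coefficient vanishes from here on
            break
        term = p ** (i * s) * q ** (i - 1) * comb(n, i - 1) * (p + q * (n + 1) / i)
        total += (-1) ** (i + 1) * term
    return total


def brute_force(r, s, p):
    """Enumerate all 2**r sequences and add up those with a long enough run."""
    q = 1.0 - p
    total = 0.0
    for seq in product((0, 1), repeat=r):
        longest = run = 0
        for x in seq:
            run = run + 1 if x else 0
            longest = max(longest, run)
        if longest >= s:
            k = sum(seq)
            total += p ** k * q ** (r - k)
    return total


if __name__ == "__main__":
    for r, s, p in [(5, 2, 0.5), (10, 3, 0.3), (12, 4, 0.6)]:
        print(r, s, p, prob_run_at_least(r, s, p), brute_force(r, s, p))
```

Agreement between the two columns for a handful of $(r, s, p)$ gives some reassurance about the signs and binomial indices in the formula.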

I wish to thank Dr F. N. David for suggesting the subject and for advice given in the 
preparation of this paper. 


REFERENCES 

David, F. N. (1947). Biometrika, 34, 336.

Markoff, A. A. (1913). Supplement to 3rd edition of Calcul des probabilités. St Petersbourg.

Mosteller, Frederick (1941). Ann. Math. Statist. 12, 228.

Uspensky, J. V. (1937). Introduction to Mathematical Probability. McGraw-Hill Book Company.





ON THE FREQUENCY CURVES OF K. PEARSON

By M. DUMAS, Ingénieur en chef de l'Artillerie navale (Marine nationale française)

1. K. Pearson has shown (1895, 1916) how it is possible, knowing the homogeneous ratios $\beta_1$ and $\beta_2$ of a distribution, to calculate the equation of the density curve of a probability law fitting that distribution. He had to consider different cases, according to the region of the $(\beta_1, \beta_2)$ plane in which the point of interest lies. These regions are shown precisely in a figure* (K. Pearson, 1916), reproduced at the head of Part II of the Tables for Statisticians and Biometricians (K. Pearson, 1931).

In the present memoir we propose to draw attention to the usefulness of completing the indications of the above-mentioned figure by tracing a new curve. For the corresponding exposition we keep the notation used on p. lx and the following pages of Part I, 1930 edition, where there is a table of the equations of the curves of K. Pearson's functions $y(x)$. We recall that the differential equation serving as the starting-point of the theory is

$$\frac{1}{y}\frac{dy}{dx} = \frac{x - a}{c_0 + c_1 x + c_2 x^2}.$$

2. Of the lines drawn on the above-mentioned figure, those most interesting to consider from the point of view of the shapes of the curves representing the functions $y(x)$ are:

(a) the straight line:

$$2\beta_2 - 3\beta_1 - 6 = 0; \qquad (1)$$

(b) the cubic and the biquadratic having respectively the equations:

$$\beta_1(\beta_2 + 3)^2 - 4(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6) = 0, \qquad (2)$$

$$\beta_1(\beta_2 + 3)^2(8\beta_2 - 9\beta_1 - 12) - 4(4\beta_2 - 3\beta_1)(5\beta_2 - 6\beta_1 - 9)^2 = 0. \qquad (3)$$

The biquadratic defined by (3) may be called the biquadratic of discontinuity of the ordinates, for the following can be established: if one considers the curves of the functions $y(x)$ corresponding respectively to the $\beta_1$ and $\beta_2$ of two neighbouring points of the biquadratic (3) situated on either side of it, one of these curves has a zero ordinate at a point whose abscissa is a solution of the equation

$$c_0 + c_1 x + c_2 x^2 = 0, \qquad (4)$$

while the other curve has, at the point corresponding to the preceding one, an infinitely large ordinate.

Thus, for example, crossing the branch marked VIII on the figure of Part II mentioned above makes the curve $y(x)$ pass from type $\mathrm{I}_U$ to type $\mathrm{I}_J$ (as is recalled in Figs. 2 and 4 attached), passing moreover through type VIII (Fig. 3).

It is convenient, notably for the construction of the curves, to replace equations (2) and (3) by parametric equations, taking as parameters $\beta_1$ and the coefficient $c_2$ of the quadratic (4). It can in fact be established that the above equations are equivalent to the relation

$$2(1 + 5c_2)\beta_2 = 3(1 + 4c_2)\beta_1 + 6(1 + 3c_2), \qquad (5)$$

completed, as regards the cubic (2), by

$$\beta_1 = -\frac{16c_2(1 + 3c_2)}{(1 + 4c_2)^2}, \qquad (6)$$

and, as regards the biquadratic (3), by

$$\beta_1 = \frac{4(1 + 3c_2)}{(1 + 4c_2)^2(1 + c_2)}. \qquad (7)$$

* This figure has in particular the advantage of showing as the limit of the useful part of the $(\beta_1, \beta_2)$ plane the straight line $\beta_2 - \beta_1 - 1 = 0$, and consequently of excluding the straight line $4\beta_2 - 3\beta_1 = 0$, shown in figure xxxvi of Part I of the same ‘Tables’.

3. The $\beta_2$ axis, the straight line $\beta_2 - \beta_1 - 1 = 0$, the straight line (1), and the curves (2) and (3) bound different regions of the $(\beta_1, \beta_2)$ plane; these are the only regions considered by K. Pearson in defining his types of curves; for each of these regions one and the same expression $y(x)$ is valid. But it happens that in some of these regions the equation corresponds to shapes of curve distinctly different from one another, for at the points where $y = 0$ the slope of the tangent to the curve $y(x)$ may be either zero, or infinitely large, or again, as an intermediate state, finite and non-zero.

We assert that all the points for which these slopes are finite and non-zero are those lying on the curve with equation

$$\beta_1(\beta_2 + 3)^2(5\beta_2 - 6\beta_1 - 9) - (4\beta_2 - 3\beta_1)(7\beta_2 - 9\beta_1 - 15)^2 = 0. \qquad (8)$$

Let us prove this proposition.

Let us first recall that K. Pearson's theory gives for the coefficients of the quadratic (4) the expressions

$$c_0 = -\mu_2(1 + 3c_2) \quad \text{and} \quad c_1 = -\tfrac{1}{2}\epsilon\sqrt{\mu_2\beta_1}\,(1 + 4c_2) \qquad (9)$$

($\epsilon = \pm 1$ having the sign of $\mu_3$), to which is added equation (5) giving $c_2$ as a function of $\beta_1$ and $\beta_2$.

Then, calling $x_1$ and $x_2$, with $x_1 < x_2$, the real solutions of equation (4), that is to say

$$x_{1,2} = \frac{-c_1 \pm \sqrt{c_1^2 - 4c_0c_2}}{2c_2}, \qquad (10)$$

let us put

$$m_1 = \frac{c_1 - x_1}{c_2(x_2 - x_1)} \quad \text{and} \quad m_2 = \frac{x_2 - c_1}{c_2(x_2 - x_1)}; \qquad (11)$$

from (9) and (10), $m_1$ and $m_2$ are the two values of

$$\frac{1}{2c_2} \pm \frac{(1 + 2c_2)(1 + 4c_2)\sqrt{\beta_1}}{2c_2\sqrt{\beta_1(1 + 4c_2)^2 + 16c_2(1 + 3c_2)}}. \qquad (12)$$

Let us now consider the particular case in which the solution $y(x)$ has the expression

$$y(x) = k(x - x_1)^{m_1}(x - x_2)^{m_2};$$

this is the case in which, for $\mu_3 > 0$, the representative curve is of K. Pearson's type VI. In this case

$$\frac{dy(x)}{dx} = \frac{k}{c_2}(x - c_1)(x - x_1)^{m_1 - 1}(x - x_2)^{m_2 - 1}.$$

Clearly, one can have at the same time $y(x)$ zero and $dy(x)/dx$ finite and non-zero only for $x = x_1$ (or $x = x_2$), provided that at the same time $m_1 = 1$ (or $m_2 = 1$), that is to say, from (12), provided that

$$\left(1 - \frac{1}{2c_2}\right)^2 = \frac{\beta_1(1 + 2c_2)^2(1 + 4c_2)^2}{4c_2^2[\beta_1(1 + 4c_2)^2 + 16c_2(1 + 3c_2)]}.$$

If in this expression we substitute the value of $c_2$ deduced from (5), we find (8).

We do not think it useful to give here the complete discussion, which alone can show that the curve with equation (8) is valid in all cases and that it is the only valid one.
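The link between the parametric form and the implicit equation of the new biquadratic can be checked with exact rational arithmetic. The sketch below assumes the reconstructed forms $2(1+5c_2)\beta_2 = 3(1+4c_2)\beta_1 + 6(1+3c_2)$ for relation (5), $\beta_1 = 2(1+3c_2)(1-2c_2)^2/(1+4c_2)^2$ for the parametric equation, and $\beta_1(\beta_2+3)^2(5\beta_2-6\beta_1-9) - (4\beta_2-3\beta_1)(7\beta_2-9\beta_1-15)^2 = 0$ for equation (8); the function names are hypothetical.

```python
from fractions import Fraction as F


def beta2_from_relation_5(beta1, c2):
    """Solve relation (5) for beta_2, given beta_1 and c_2."""
    return (3 * (1 + 4 * c2) * beta1 + 6 * (1 + 3 * c2)) / (2 * (1 + 5 * c2))


def beta1_on_new_biquadratic(c2):
    """Parametric beta_1 on the biquadratic of discontinuity of the derivatives."""
    return 2 * (1 + 3 * c2) * (1 - 2 * c2) ** 2 / (1 + 4 * c2) ** 2


def equation_8(beta1, beta2):
    """Left-hand side of the implicit equation (8); zero on the curve."""
    return (beta1 * (beta2 + 3) ** 2 * (5 * beta2 - 6 * beta1 - 9)
            - (4 * beta2 - 3 * beta1) * (7 * beta2 - 9 * beta1 - 15) ** 2)


if __name__ == "__main__":
    # Arbitrary parameter values, avoiding c2 = -1/4 and c2 = -1/5.
    for c2 in (F(-1, 10), F(-3, 10), F(1, 7), F(2, 3)):
        b1 = beta1_on_new_biquadratic(c2)
        b2 = beta2_from_relation_5(b1, c2)
        print(c2, b1, b2, equation_8(b1, b2))
    # Named points quoted in the text: L, D and M.
    for point in [(F(8, 25), F(12, 5)), (F(2), F(6)), (F(0), F(15, 7))]:
        print(point, equation_8(*point))
```

Because `Fraction` arithmetic is exact, a printed residual of 0 means the point lies on the curve exactly rather than to rounding error.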



4. Equation (8) is the equation of a new biquadratic which, to distinguish it from the one having equation (3), may be called the biquadratic of discontinuity of the derivatives, on account of the property that led us to consider it. Its parametric equations are equation (5) completed by

$$\beta_1 = \frac{2(1 + 3c_2)(1 - 2c_2)^2}{(1 + 4c_2)^2}.$$

Fig. 1. Boundaries of regions of the $(\beta_1, \beta_2)$ plane.

The useful part of this biquadratic has two branches whose general shape is that of the two branches of the biquadratic (3); one of them is asymptotic to a straight line; the other is, for $\beta_2$ infinitely large, asymptotic to

$$\beta_1 = 39.2;$$

this last equation is to be compared with the following:

$\beta_1 = 32$, which is that of the asymptote to the cubic (2);

$\beta_1 = 50$, which is that of an asymptote to the biquadratic (3).

Figs. 2-12. Shapes of the curves $y(x)$ for $\beta_1 = 4$ (that is, along the section FF' of Fig. 1).

* According to the figure of K. Pearson (1931) referred to in § 1 of the present memoir.

† We have marked $\mathrm{I}_J$ although, on the figure referred to in the preceding note, no type is indicated in the corresponding region.

Figs. 13-18. Shapes of the curves $y(x)$ for the points of the biquadratic of discontinuity of the derivatives. The panels correspond to points from A to D (inclusive), from D to M (exclusive), M ($\beta_1 = 0$, $\beta_2 = 15/7$, symmetrical curve), from M to L (exclusive), L ($\beta_1 = 0.32$, $\beta_2 = 2.4$), and from L (exclusive) to B.







As can be seen in Fig. 1, these two branches join, tangentially to the $\beta_2$ axis, at the point M ($\beta_1 = 0$, $\beta_2 = 15/7$); one of them cuts the biquadratic of discontinuity of the ordinates at the point

$$\beta_1 = 0.32, \quad \beta_2 = 2.4,$$

which is K. Pearson's point L, while the other branch cuts the straight line with equation (1) at the point D ($\beta_1 = 2$, $\beta_2 = 6$).

5. To give an idea of the interest of considering the biquadratic of discontinuity of the derivatives, we have drawn the attached Figs. 2 to 18. Figs. 2 to 12 show the succession of shapes of the curves $y(x)$ met with in the case where, $\beta_1$ remaining constantly equal to 4, $\beta_2$ increases from about 5.5 to about 12; with these values the representative point always remains between the limiting straight line with equation

$$\beta_2 - \beta_1 - 1 = 0$$

and the cubic with equation (2), and passes through the point

$$\beta_1 = 4, \quad \beta_2 = 9,$$

which is K. Pearson's point E, the exponential point.

It will be noticed that to K. Pearson's type $\mathrm{I}_J$ (Figs. 6, 7 and 8) there correspond three curves of distinctly different shapes, and that the same is true of type VI (Figs. 10, 11 and 12).

Figs. 13 to 18 show the succession of shapes of the curves $y(x)$ met with in the case where the point $(\beta_1, \beta_2)$ follows the biquadratic of discontinuity of the derivatives from A (Fig. 1) to B, passing through M.

We intend to publish shortly a complete account of the mathematical theory leading to the functions $y(x)$; this account will be accompanied by a table of about 34 figures giving the shapes of the curves $y(x)$ for all possible points of the $(\beta_1, \beta_2)$ plane, including in particular Figs. 2 to 18, and by full analytical details on the functions $y(x)$, notably on those corresponding to the points of the biquadratic of discontinuity of the derivatives (Figs. 13 to 18).


BIBLIOGRAPHY

1. Pearson, K. (1895). Mathematical contributions to the theory of evolution. Skew variation in homogeneous material. Philos. Trans. A, 186, 343-414.

2. Pearson, K. (1901). Philos. Trans. A, 197, 443-59. Supplement to 1.

3. Pearson, K. (1916). Philos. Trans. A, 216, 429-57. Second supplement to 1.

4. Pearson, K. (1930). Tables for Statisticians and Biometricians. Part I (1930 edition).

5. Pearson, K. (1931). Part II of 4.

6. Elderton, W. Palin (1937). Frequency-Curves and Correlation. Cambridge University Press.





THE DISTRIBUTION OF THE EXTREME DEVIATE FROM THE 
SAMPLE MEAN AND ITS STUDENTIZED FORM* 


By K. R. NAIR 


1. Introduction 

Denote by $x_{(1)}, \ldots, x_{(n)}$ a random sample of $n$ observations drawn from any statistical universe. Let $x_1, \ldots, x_n$ be the same sample arranged in ascending order of magnitude so that $x_r$ is the $r$th ranked (or ordered) variate in the sample $\{x_{(i)}\}$.

Various authors have studied the sampling distribution of $x_r$. It takes a simple form when $r = 1$ or $n$. Thus, if $f(x)$ is the probability function of the universe from which the $x_{(i)}$'s are drawn, the probability that $x_n \le X$ is given by

$$P(X) = \left[\int_{-\infty}^{X} f(x)\,dx\right]^n. \qquad (1)$$

If the parent universe is normal with mean $\mu$ and standard deviation $\sigma$,

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]. \qquad (2)$$

The probability that $x_n \le X$ is the same as that of

$$\frac{x_n - \mu}{\sigma} \le \frac{X - \mu}{\sigma} = V \qquad (3)$$

and may be written

$$P(X) = P(V) = \left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{V} e^{-\frac{1}{2}x^2}\,dx\right]^n. \qquad (4)$$

Tippett (1925) tabulated values of $P(V)$ for $n$ ranging between 3 and 1000. E. S. Pearson supplemented this by a table of percentage points of $V$, published as Table XXI bis in Tables for Statisticians and Biometricians, Part II, to facilitate tests of significance of a single outlying observation $x_1$ or $x_n$, when both $\mu$ and $\sigma$ are known.

Very often we do not know either $\mu$ or $\sigma$ or both. When $\mu$ alone is unknown, Irwin (1925) suggested a test for a single outlier $x_n$ based on the statistic $(x_n - x_{n-1})/\sigma$. If there are $k$ large outliers, his test would involve $(x_{n-k+1} - x_{n-k})/\sigma$ and, for $k$ small outliers, the statistic $(x_{k+1} - x_k)/\sigma$.

Another statistic commonly used to test a single outlier, when $\mu$ is unknown, is one based on the range, viz. $(x_n - x_1)/\sigma$ (see ‘Student’, 1927).

Intuitively, one feels that a better criterion than $(x_n - x_{n-1})/\sigma$ or $(x_n - x_1)/\sigma$ when there is only one outlier $x_n$ will be the extreme deviate $(x_n - \bar{x})/\sigma$ proposed by McKay (1936). The range $(x_n - x_1)$ will certainly be a surer guide than $(x_n - x_{n-1})$ or $(x_n - \bar{x})$ if both $x_1$ and $x_n$ are outliers.

In general, if there are $k$ outliers at the lower end and $l$ outliers at the upper end, a suitable criterion will be that based on the difference of the means of the first $k$ and the last $l$ outliers. When the outliers are all at one end, say the first $k$, we should write $k + l = n$. The distribution of this general statistic and some of its special cases are considered in § 6.


* Part of a thesis approved for the degree of Ph.D. of the University of London. 





All these test criteria should be handled with extreme caution in practice. It may be 
appropriate here to quote Pearson & Chandrasekar’s (1938) warning: 

To base the choice of the test of a statistical hypothesis upon an inspection of the observations is 
a dangerous practice ; a study of the configuration of a sample is almost certain to reveal some feature, 
or features, which are exceptional if the hypothesis is true, . . . By choosing the feature most unfavourable
to the hypothesis out of a very large number of features examined, it will usually be possible to find 
some reason for rejecting the hypothesis. 

Let us now consider the case where both $\mu$ and $\sigma$ are unknown. Their sample estimates are

$$\bar{x} = \sum (x_i)/n \quad \text{and} \quad s = \left[\sum (x_i - \bar{x})^2/(n - 1)\right]^{\frac{1}{2}}. \qquad (5)$$

If $x_n$ is to be tested as an outlier, the obvious course is to consider the modified McKay criterion $(x_n - \bar{x})/s$. The sampling distribution of this test criterion is not known, but Thompson (1935) suggested an alternative method based on the sampling distribution of

$$\tau_{(i)} = (x_{(i)} - \bar{x})/s, \qquad (6)$$

where (see above) $x_{(i)}$ is a random member of the sample. He showed that the distribution of $\tau$ can be derived from that of $t$ (with $(n - 2)$ degrees of freedom) using the relation

$$\tau = \frac{(n - 1)\,t}{\sqrt{nt^2 + n(n - 2)}}. \qquad (7)$$

On a critical examination of Thompson’s criterion Pearson & Chandrasekar found that 
it is effective only when there is essentially one outlier and not many as Thompson seemed 
to believe. 
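Relation (7) can be illustrated numerically. The sketch below assumes one particular construction of $t$ that is not spelled out in the text: the deviation of the chosen observation from the mean of the remaining $n - 1$, divided by its estimated standard error from those remaining observations, giving $n - 2$ degrees of freedom. The function names are hypothetical, and the data are the five leather test values quoted later in the paper.

```python
from math import sqrt
from statistics import mean, stdev


def thompson_tau(sample, i):
    """tau = (x_i - mean) / s, with s computed on n - 1 degrees of freedom."""
    return (sample[i] - mean(sample)) / stdev(sample)


def t_from_rest(sample, i):
    """Assumed t statistic: x_i compared with the other n - 1 observations."""
    n = len(sample)
    rest = sample[:i] + sample[i + 1:]
    return (sample[i] - mean(rest)) / (stdev(rest) * sqrt(n / (n - 1)))


def tau_from_t(t, n):
    """Relation (7): tau = (n - 1) t / sqrt(n t^2 + n (n - 2))."""
    return (n - 1) * t / sqrt(n * t * t + n * (n - 2))


if __name__ == "__main__":
    data = [32.44, 36.45, 39.64, 40.13, 41.09]
    for i in range(len(data)):
        t = t_from_rest(data, i)
        print(i, thompson_tau(data, i), tau_from_t(t, len(data)))
```

Under that assumption the two printed columns coincide for every observation, which is the algebraic content of (7).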

Suppose now that $\mu$ is known and $\sigma$ is unknown. Thompson’s criterion could be extended to this situation if we calculate

$$\tau'_{(i)} = (x_{(i)} - \mu)/s', \qquad (8)$$

where $s' = \left[\sum (x_i - \mu)^2/n\right]^{\frac{1}{2}}$. The distribution of $\tau'$ can be obtained from that of $t$ (with $(n - 1)$ degrees of freedom) using the relation

$$\tau' = \frac{\sqrt{n}\,t}{\sqrt{t^2 + n - 1}}. \qquad (9)$$

We now come to the main theme of this paper which is to consider a method of studentization different from Thompson’s. If an estimate $s_\nu$ of the unknown $\sigma$ is available with $\nu$ degrees of freedom independent of the sample $\{x_{(i)}\}$, a test for a single outlier can be made, using $(x_n - \bar{x})/s_\nu$ if $\mu$ is unknown, and using $(x_n - \mu)/s_\nu$ if $\mu$ is known. The distributions of these studentized test criteria can be derived by Hartley’s (1944) method.

To test whether there are two outliers, one at each end, the studentized range $(x_n - x_1)/s_\nu$ will be a suitable criterion. Pearson & Hartley (1943) have prepared the necessary tables for this test.

In this paper attention is mainly concentrated on the McKay statistic, $u = (x_n - \bar{x})/\sigma$ (or $(\bar{x} - x_1)/\sigma$) and its studentized form $(x_n - \bar{x})/s_\nu$ (or $(\bar{x} - x_1)/s_\nu$). It is shown that the distribution of $u$ can be obtained by a more direct method than was employed by McKay and that it can be reduced to certain integrals which have recently been termed $G$-functions by Godwin (1945) in his representation of the distribution of the mean deviation. Tables of the probability integral of $u$ and of its studentized form have been prepared.

Apart from serving as a criterion for rejection of an outlying observation in a ‘normal’ sample when $\mu$ and $\sigma$ are unknown, the studentized form of $u$ has useful application in judging the significance of a single outstanding treatment (best or worst) in a group of treatments tried out in a designed experiment. Some illustrations are given in the paper.





2. Distribution of the extreme deviate 

Assuming, without loss of generality, that $\mu = 0$ and $\sigma = 1$ in the normal probability function (2), the joint distribution of the ordered variates $x_1, \ldots, x_n$ is

$$\frac{n!}{[\sqrt{2\pi}]^n} \exp\left[-\frac{1}{2}\sum_{1}^{n} x_i^2\right] \prod_{1}^{n} dx_i.$$

Making an orthogonal transformation of $x_1, \ldots, x_n$ into $y_1, \ldots, y_n$ defined by

$$y_i = \frac{i x_{i+1} - (x_1 + \cdots + x_i)}{\sqrt{i(i+1)}} = \frac{z_i}{\sqrt{i(i+1)}} \ \text{(say)} \quad (i = 1, \ldots, (n-1)),$$

$$y_n = \frac{1}{\sqrt{n}}\sum_{1}^{n} (x_i) = \sqrt{n}\,\bar{x},$$

it follows that

$$\sum_{1}^{n} x_i^2 = \sum_{1}^{n} y_i^2 = \sum_{1}^{(n-1)} \frac{z_i^2}{i(i+1)} + n\bar{x}^2,$$

$$\prod_{1}^{n} dx_i = \prod_{1}^{n} dy_i = \frac{1}{(n-1)!}\,d\bar{x} \prod_{1}^{(n-1)} dz_i.$$

The joint distribution of $\bar{x}, z_1, \ldots, z_{(n-1)}$ is therefore

$$\frac{n}{[\sqrt{2\pi}]^n} \exp\left[-\frac{n\bar{x}^2}{2} - \frac{1}{2}\sum_{1}^{(n-1)} \frac{z_i^2}{i(i+1)}\right] d\bar{x} \prod_{1}^{(n-1)} dz_i. \qquad (14)$$

Integrating out for $\bar{x}$ from $-\infty$ to $+\infty$, the joint distribution of $z_1, \ldots, z_{(n-1)}$ is

$$\frac{\sqrt{n}}{[\sqrt{2\pi}]^{n-1}} \exp\left[-\frac{1}{2}\sum_{1}^{(n-1)} \frac{z_i^2}{i(i+1)}\right] \prod_{1}^{(n-1)} dz_i. \qquad (15)$$

It will be noticed that

$$z_{(n-1)} = n(x_n - \bar{x}) = nu. \qquad (16)$$

To derive the distribution of $z_{(n-1)}$ or of $u$, we have to integrate out for $z_1, \ldots, z_{(n-2)}$ in (15). The ranges of the $z_i$ are from 0 to $+\infty$ but they are interlinked by the inequalities

$$0 \le z_1 \le z_2 \le \ldots \le z_{(n-2)} \le z_{(n-1)} = nu. \qquad (17)$$

At this stage we bring in the $G$-functions defined by Godwin as follows:

$$G_0(x) = 1, \qquad G_r(x) = \int_0^x \exp\left[-\frac{t^2}{2r(r+1)}\right] G_{r-1}(t)\,dt. \qquad (18)$$

Integrating out $z_1, \ldots, z_{(n-2)}$ from (15) between the limits (17), the distribution of $u$ can be written

$$f_n(u)\,du = \frac{n\sqrt{n}}{[\sqrt{2\pi}]^{n-1}} \exp\left[-\frac{nu^2}{2(n-1)}\right] G_{n-2}(nu)\,du, \qquad (19)$$

and the probability integral of $u$ is

$$P_n(u) = \int_0^u f_n(u)\,du = \frac{\sqrt{n}}{[\sqrt{2\pi}]^{n-1}}\,G_{n-1}(nu). \qquad (20)$$
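For $n = 3$ the probability integral can be evaluated directly from the $G$-functions. The sketch below assumes the definitions $G_0(x) = 1$, $G_r(x) = \int_0^x \exp[-t^2/(2r(r+1))]\,G_{r-1}(t)\,dt$ and $P_n(u) = \sqrt{n}\,G_{n-1}(nu)/(\sqrt{2\pi})^{n-1}$; it uses the error function for $G_1$ and Simpson's rule for $G_2$, which is a choice made here and not the method of the paper. The function names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def g1(t):
    """G_1(t) = integral from 0 to t of exp(-s^2 / 4) ds = sqrt(pi) * erf(t / 2)."""
    return sqrt(pi) * erf(t / 2.0)


def g2(x, steps=200):
    """G_2(x) by Simpson's rule on exp(-t^2 / 12) * G_1(t); steps must be even."""
    def integrand(t):
        return exp(-t * t / 12.0) * g1(t)

    h = x / steps
    total = integrand(0.0) + integrand(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


def p3(u):
    """P_3(u) = sqrt(3) / (2 pi) * G_2(3u)."""
    return sqrt(3.0) / (2.0 * pi) * g2(3.0 * u)


if __name__ == "__main__":
    for u in (0.05, 0.10, 0.20, 1.0, 2.0, 5.0):
        print(u, round(p3(u), 6))
```

The values for small $u$ can be set against the exact row for $n = 3$ in Table 4, and $P_3(u)$ should approach 1 as $u$ grows.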

Proceeding by an entirely different method, McKay derived an expression for P„(«) in 


the form 


PM - «p [-££] 


and obtained a recurrence formula 


/.<«> = 





The integral-power appearing on the right-hand side of (21) is the probability integral of $(x_n - \mu)/\sigma$ given in (4), revealing an interesting connexion between the probability integrals of $(x_n - \bar{x})/\sigma$ and $(x_n - \mu)/\sigma$.

McKay did not attempt to tabulate values of $P_n(u)$, but for getting the upper 5 or 1 % level of $u$, that is, when $P_n(u) = 0.95$ or $0.99$, suggested the approximation

$$1 - P_n(u) \simeq \frac{n}{\sqrt{2\pi}} \int_{u\sqrt{n/(n-1)}}^{\infty} e^{-\frac{1}{2}t^2}\,dt. \qquad (23)$$

With the help of exact values of $P_n(u)$ tabulated in the next section, it has been possible to examine the closeness of this approximation. The agreement is remarkably good.

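He also gave an approximate formula when $u$ is very small, viz.

$$P_n(u) \simeq \frac{n^2\sqrt{n}}{2}\left(\frac{u}{\sqrt{2\pi}}\right)^{n-1} \quad (n > 2). \qquad (24)$$

This does not give a good agreement with the exact values. A better approximation was found to be

$$P_n(u) \simeq \frac{\sqrt{n}}{(n-1)!}\left(\frac{nu}{\sqrt{2\pi}}\right)^{n-1} \quad (n > 2), \qquad (25)$$

which could further be improved to

$$P_n(u) \simeq \frac{\sqrt{n}}{(n-1)!}\left(\frac{nu}{\sqrt{2\pi}}\right)^{n-1}\left[1 - \frac{n(n-1)u^2}{2(n+1)}\right] \quad (n \ge 2). \qquad (26)$$

A comparison of these three approximations with the exact values has been made for a few small values of $u$ and with different values of $n$ in Table 4.

From (25) and (26) we can derive the following approximation for $G_r(x)$ when $x$ is small:

$$G_r(x) \simeq \frac{x^r}{r!}, \qquad (27)$$

$$G_r(x) \simeq \frac{x^r}{r!}\left[1 - \frac{rx^2}{2(r+1)(r+2)}\right]. \qquad (28)$$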
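The three small-$u$ approximations are easy to evaluate and compare with the entries of Table 4. The sketch below assumes the forms $P_n(u) \simeq \tfrac{1}{2}n^2\sqrt{n}\,(u/\sqrt{2\pi})^{n-1}$ for (24), $P_n(u) \simeq \sqrt{n}\,(nu/\sqrt{2\pi})^{n-1}/(n-1)!$ for (25), and (25) multiplied by $1 - n(n-1)u^2/\{2(n+1)\}$ for (26); the function names are hypothetical.

```python
from math import factorial, pi, sqrt

ROOT_2PI = sqrt(2.0 * pi)


def approx_24(n, u):
    """McKay's small-u approximation, as reconstructed in (24)."""
    return 0.5 * n * n * sqrt(n) * (u / ROOT_2PI) ** (n - 1)


def approx_25(n, u):
    """Leading term of P_n(u) for small u, as in (25)."""
    return sqrt(n) / factorial(n - 1) * (n * u / ROOT_2PI) ** (n - 1)


def approx_26(n, u):
    """(25) with the first correction factor, as in (26)."""
    return approx_25(n, u) * (1.0 - n * (n - 1) * u * u / (2.0 * (n + 1)))


if __name__ == "__main__":
    for n in (3, 4, 5):
        for u in (0.05, 0.10, 0.20):
            print(n, u, round(approx_24(n, u), 6),
                  round(approx_25(n, u), 6), round(approx_26(n, u), 6))
```

For $n = 3$ the first two functions coincide, since $n^2/2 = n^{n-1}/(n-1)!$ only when $n = 3$.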


3. Tables of the probability integral of the extreme deviate 

The starting-point was manuscript tables of the $G_r$-functions ($r = 2, \ldots, 8$) prepared by Godwin (1945) and Hartley (1945) in the course of their work on the mean deviation. The functions they had actually tabulated were multiples of the $G$-functions, namely, $G_r^*(x) = b_r G_r(x)$. Keeping $n = r + 1$, the probability integral of $u$ given in (20) may be written

$$P_{r+1}(u) = c_r\,G_r^*[(r+1)u]. \qquad (29)$$

Values of $b_r$ and $c_r$ are given below.


    r    b_r                          c_r
    2    480 π^(-1) 2^(-1/2)          25515518 × 10^(-10)
    3    480^2 π^(-3/2) 2^(-1)        61380797 × 10^(-13)
    4    480^3 π^(-2) 2^(-3/2)        14297046 × 10^(-15)
    5    480^4 π^(-5/2) 2^(-3)        65256786 × 10^(-18)
    6    480^5 π^(-3) 2^(-9/2)        29368910 × 10^(-20)
    7    480^6 π^(-7/2) 2^(-6)        13081952 × 10^(-22)
    8    480^7 π^(-4) 2^(-15/2)       57814607 × 10^(-25)





The task was therefore to convert the $G^*$-tables from regular arguments in $x$ to regular arguments in $u$. In order to avoid excessive interpolation for $x = (r + 1)u$ in the $G^*$-tables it was decided to adopt a pivotal interval of 0.05 for $u$, which required selecting for $x$ the pivotal intervals 0.15, 0.20, 0.25, 0.30, 0.35, 0.40 and 0.45 for $r = 2, 3, \ldots, 8$ respectively. The $x$-interval in Godwin & Hartley’s tables was 0.05 for $r = 2, 3, 4$; and 0.10 for $r = 5, 6, 7, 8$. At the intervals 0.35 and 0.45 for $x$ in $G_6^*$ and $G_8^*$, the half-way point formula of Lagrangian interpolation was applied on the 0.1 interval values of $G_6^*(x)$ and $G_8^*(x)$. Values of $P_n(u)$ for $n = 3$ to 9 were then obtained at intervals of 0.05 and then subtabulated on the National Accounting Machine at intervals of 0.01. These values correct to six decimal places are given in Table 1 at the end of the paper. All the calculations were undertaken by the National Physical Laboratory, Mathematical Division, under the direction of Dr Goodwin and Mr Vickers.

Table 2, printed after Table 1 at the end of the paper, gives percentage points of the 
extreme deviate, at twelve different levels. These were calculated by interpolation from 
Table 1. 

We are now in a position to examine the adequacy of the approximations (23)-(26). 
Taking (23) first, the following Table 3 shows the exact and approximate values of the 
upper 5 and 1 % points. 

Table 3. Comparison of exact and approximate upper percentage points of u

                  5 %                        1 %
    n     Exact     Approx. (23)     Exact     Approx. (23)
    3     1.7375    1.7376           2.2152    2.2162
    4     1.9409    1.9413           2.4310    2.4310
    5     2.0801    2.0807           2.5743    2.5743
    6     2.1843    2.1854           2.6794    2.6796
    7     2.2667    2.2683           2.7613    2.7616
    8     2.3344    2.3364           2.8279    2.8281
    9     2.3916    2.3940           2.8837    2.8839


The agreement is very good. This high accuracy of his formula was apparently not known 
to McKay and has now been established by computation of the exact results. It may enable 
us to extend the range of the present Table 2 for obtaining upper percentage points of u for 
sample sizes beyond 9. 
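McKay's approximation can be inverted to give upper percentage points for any sample size. The sketch below assumes that (23) reads $1 - P_n(u) \simeq n\{1 - \Phi(u\sqrt{n/(n-1)})\}$, where $\Phi$ is the standard normal distribution function, so that the upper $100\alpha$ % point is $u \simeq \sqrt{(n-1)/n}\;\Phi^{-1}(1 - \alpha/n)$. The function name is hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from math import sqrt
from statistics import NormalDist


def mckay_upper_point(n, alpha):
    """Approximate upper 100*alpha % point of u = (x_n - mean) / sigma."""
    z = NormalDist().inv_cdf(1.0 - alpha / n)
    return z * sqrt((n - 1) / n)


if __name__ == "__main__":
    for n in (3, 5, 9, 12, 20):
        print(n, round(mckay_upper_point(n, 0.05), 4),
              round(mckay_upper_point(n, 0.01), 4))
```

The rows for $n \le 9$ can be set beside the approximate columns of Table 3; the rows for larger $n$ illustrate the suggested extension beyond the tabulated range and carry whatever error the approximation has there.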

Approximations (24), (25) and (26) were compared with the exact value of $P_n(u)$ for $u = 0.05$, $0.10$ and $0.20$ and $n = 3, 4, 5$. The results are given in Table 4 where (a), (b), (c) stand for the approximations (24), (25), (26) respectively and (d) stands for the exact value. (a) and (b) are identical when $n = 3$, but it will be seen that when $n$ exceeds 3, (b) gives much closer agreement than (a).

With the help of the approximations (23) and (26) it may also be possible to extend Godwin 
& Hartley’s tables of upper and lower percentage points of the mean deviation for sample 
size beyond 10. 

To illustrate the use of Table 1 or 2 in testing the significance of an outlying observation 
an example is given below. 






Example. This is taken from McKay’s paper. In the course of routine testing of a standard leather product of a tannery, five parallel tests yielded the following values for the hide substance content of the leather specimens:

32.44, 36.45, 39.64, 40.13, 41.09.

The first observation appears unduly low.

Long experience of the product in question has established a value of 2.2 for the standard deviation $\sigma$. We get $u = (37.95 - 32.44)/2.2 = 2.50$.

Value of $P_5(2.50)$ found from Table 1 is 0.987 031. The probability of getting a value 2.50 or larger for $u$ is thus about 0.013. If we use Table 2 only, we see that this probability is between 0.025 and 0.01. On the 5 % level of significance, we conclude that the observation 32.44 is anomalous.
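The arithmetic of the example can be retraced in a few lines. The sketch below computes $u$ from the five quoted values with $\sigma = 2.2$, and then, as a stand-in for Table 1, uses McKay's upper-tail approximation in the assumed form $1 - P_n(u) \simeq n\{1 - \Phi(u\sqrt{n/(n-1)})\}$; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

values = [32.44, 36.45, 39.64, 40.13, 41.09]
sigma = 2.2  # standard deviation taken as known from long experience
n = len(values)

# Extreme deviate at the lower end: (mean - smallest) / sigma.
u = (mean(values) - min(values)) / sigma

# Approximate probability of a deviate at least this large (not the exact table value).
tail = n * (1.0 - NormalDist().cdf(u * sqrt(n / (n - 1))))

print(f"mean = {mean(values):.2f}, u = {u:.2f}, approximate tail probability = {tail:.4f}")
```

The approximate tail probability falls between 0.01 and 0.025, in line with the reading from Table 2.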

Table 4. Comparison of approximate and exact values of $P_n(u)$ when u is small

    n    Method    u = 0.05     u = 0.10     u = 0.20
    3    (a)       0.003 101    0.012 405    0.049 620
         (b)       0.003 101    0.012 405    0.049 620
         (c)       0.003 095    0.012 312    0.048 131
         (d)       0.003 095    0.012 312    0.048 166
    4    (a)       0.000 127    0.001 016    0.008 127
         (b)       0.000 169    0.001 355    0.010 836
         (c)       0.000 169    0.001 338    0.010 316
         (d)       0.000 169    0.001 338    0.010 334
    5    (a)       0.000 004    0.000 071    0.001 133
         (b)       0.000 009    0.000 148    0.002 360
         (c)       0.000 009    0.000 145    0.002 202
         (d)       0.000 009    0.000 145    0.002 210

(a) McKay’s approximation (24). (c) New approximation (26).

(b) New approximation (25). (d) Exact value.


4. The studentized integral of the extreme deviate 

We have seen that in order to apply the test for the extreme deviate $x_n - \bar{x}$, it is necessary to know the standard deviation $\sigma$ of the parent normal population. We shall now assume that an estimate $s_\nu$ of the unknown $\sigma$, equal to the square-root of a variance based on $\nu$ degrees of freedom and independent of the sample $(x_1, \ldots, x_n)$, is available.

Let ${}_\nu P_n(Q)$ denote the probability that $u_\nu = (x_n - \bar{x})/s_\nu \le Q$. When $\nu \to \infty$ this tends to the probability $P_n(Q)$ that $u = (x_n - \bar{x})/\sigma \le Q$ given by (20).

Using Hartley’s expansion for studentized integrals up to terms in $\nu^{-2}$ we may write

$${}_\nu P_n(Q) = a_0 + a_1\nu^{-1} + a_2\nu^{-2}, \qquad (30)$$

where $a_0 = P_n(Q)$,

$$a_1 = \tfrac{1}{4}\{Q^2 P_n''(Q) - Q P_n'(Q)\}, \qquad (31)$$

and

$$a_2 = \tfrac{1}{96}\{3Q^4 P_n^{iv}(Q) - 2Q^3 P_n'''(Q)\} - \tfrac{1}{8}a_1. \qquad (32)$$
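The expansion can be tried on a case where the studentized integral is known independently. The sketch below is an illustration chosen here, not taken from the paper: it applies the assumed forms $a_1 = \tfrac14\{Q^2P'' - QP'\}$ and $a_2 = \tfrac{1}{96}\{3Q^4P^{iv} - 2Q^3P'''\} - \tfrac18 a_1$ to the standard normal integral $P(Q) = \Phi(Q)$, whose studentized form is the distribution function of Student's $t$ with $\nu$ degrees of freedom, and compares the result with a direct numerical integration of the $t$ density. The function names are hypothetical.

```python
from math import gamma, pi, sqrt
from statistics import NormalDist


def studentized_normal_cdf(q, nu):
    """Three-term expansion a0 + a1/nu + a2/nu^2 applied to P(Q) = Phi(Q)."""
    phi = NormalDist().pdf(q)
    d1 = phi                        # first derivative of Phi
    d2 = -q * phi                   # second derivative
    d3 = (q * q - 1.0) * phi        # third derivative
    d4 = (3.0 * q - q ** 3) * phi   # fourth derivative
    a0 = NormalDist().cdf(q)
    a1 = 0.25 * (q * q * d2 - q * d1)
    a2 = (3.0 * q ** 4 * d4 - 2.0 * q ** 3 * d3) / 96.0 - a1 / 8.0
    return a0 + a1 / nu + a2 / nu ** 2


def student_t_cdf(q, nu, steps=2000):
    """Student's t distribution function for q >= 0 by Simpson's rule."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))

    def density(t):
        return c * (1.0 + t * t / nu) ** (-(nu + 1) / 2)

    h = q / steps
    total = density(0.0) + density(q)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * density(k * h)
    return 0.5 + total * h / 3.0


if __name__ == "__main__":
    for nu in (10, 20, 60):
        print(nu, studentized_normal_cdf(2.0, nu), student_t_cdf(2.0, nu))
```

The two columns draw together as $\nu$ increases, which is the behaviour expected of an expansion truncated after the term in $\nu^{-2}$.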




Godwin & Hartley had prepared manuscript tables of $G_r'(x)$, the first derivative of $G_r(x)$. Values of $P_{r+1}'(Q)$ could be calculated with the help of these tables, using the formula

24 

^ + i«3) = F (r+l)c r G?'(x), (33) 

• 

where $x = (r + 1)Q = nQ$ and $w$ is the interval of $x$ in $G_r'(x)$.

Tables of $P_n'(Q)$ for n = 3, ..., 9 were prepared at the pivotal interval 0.05 for Q, in just the same way as $P_n(u)$ was calculated from $G_r(x)$.

Values of the 2nd, 3rd and 4th derivatives of $P_n(Q)$ were obtained from $P_n'(Q)$ by numerical differentiation on the National Accounting Machine. Substituting these in (31) and (32), values of $a_1$ and $a_2$ were obtained at a pivotal interval of 0.05. Table 5, printed at the end of the paper, gives the values of $a_0$, $a_1$ and $a_2$ at 0.20 interval for the argument Q. Values of ${}_{\nu}P_n(Q)$ can easily be read off from this table using formula (30). For convenience in making tests of significance, 5 and 1% points have been calculated in Table 6, which is also reproduced at the end of the paper.

Since numerical differentiation had to be used in calculating even $a_1$ and $a_2$, it was not feasible to bring in the next higher terms $a_3\nu^{-3}$ and $a_4\nu^{-4}$ for calculating ${}_{\nu}P_n(Q)$, as that would have involved calculation of the 5th to 8th derivatives of $P_n(Q)$ by numerical differentiation.

Values of $a_1$ are given in Table 5 to four decimals for all values of n except n = 3, for which five decimals have been supplied. The maximum error is not more than 2 in the last place.

Values of $a_2$ are given to three decimals for all values of n except n = 9, for which only two decimals are retained when Q exceeds 1.00. The maximum error is not more than 4 in the last place.

This degree of error is inherent in the method of numerical differentiation which was used, but will not seriously affect values of ${}_{\nu}P_n(Q)$ when $\nu$ is moderately large. It is essential, however, to check against systematic errors, and for this purpose the following checks proved quite useful:

(i) When n = 3, it is comparatively easy to calculate the exact values of $a_1$ and $a_2$ directly from the expressions

$$a_1 = \frac{3\sqrt{3}}{8\pi}\,Q\left\{3Q e^{-3Q^2} - (3Q^2 + 2)\,e^{-\frac{3}{4}Q^2}\int_0^{\frac{3}{2}Q} e^{-t^2}\,dt\right\},$$

$$a_2 = \frac{3\sqrt{3}}{256\pi}\,Q^3\left\{3Q(189Q^2 - 16)\,e^{-3Q^2} - (27Q^4 - 42Q^2 - 8)\,e^{-\frac{3}{4}Q^2}\int_0^{\frac{3}{2}Q} e^{-t^2}\,dt\right\} - \tfrac{1}{8}a_1.$$

In the table below a small panel of values has been calculated from the above exact formulae and compared with the values given in Table 5. The agreement is very satisfactory.

Table 7

           a_1                        a_2
  Q    Exact      Approximate    Exact     Approximate

  1   -0.38720    -0.38720       +0.275    +0.276
  2   -0.25540    -0.25540       -0.550    -0.551
  3   -0.01866    -0.01865       -0.324    -0.323
  4   -0.00022    -0.00021       -0.014    -0.014
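
The exact column of Table 7 can be recomputed from closed forms for n = 3. The sketch below assumes the expressions $a_1 = \frac{3\sqrt3}{8\pi}Q\{3Qe^{-3Q^2} - (3Q^2+2)e^{-\frac34Q^2}J\}$ and $a_2 = \frac{3\sqrt3}{256\pi}Q^3\{3Q(189Q^2-16)e^{-3Q^2} - (27Q^4-42Q^2-8)e^{-\frac34Q^2}J\} - \frac18 a_1$, with $J = \int_0^{3Q/2}e^{-t^2}dt$ evaluated through the error function. The function names are illustrative and not taken from the paper.

```python
import math

C1 = 3.0 * math.sqrt(3.0) / (8.0 * math.pi)
C2 = 3.0 * math.sqrt(3.0) / (256.0 * math.pi)

def _pieces(q):
    """Common factors: exp(-3Q^2), exp(-3Q^2/4) and J = int_0^{3Q/2} exp(-t^2) dt."""
    e1 = math.exp(-3.0 * q * q)
    e2 = math.exp(-0.75 * q * q)
    j = 0.5 * math.sqrt(math.pi) * math.erf(1.5 * q)
    return e1, e2, j

def a1_n3(q):
    """Coefficient a_1 of the 1/nu term for n = 3."""
    e1, e2, j = _pieces(q)
    return C1 * q * (3.0 * q * e1 - (3.0 * q * q + 2.0) * e2 * j)

def a2_n3(q):
    """Coefficient a_2 of the 1/nu^2 term for n = 3."""
    e1, e2, j = _pieces(q)
    brace = 3.0 * q * (189.0 * q * q - 16.0) * e1 \
        - (27.0 * q**4 - 42.0 * q * q - 8.0) * e2 * j
    return C2 * q**3 * brace - a1_n3(q) / 8.0

for q in (1.0, 2.0, 3.0, 4.0):
    print(f"Q = {q:.0f}   a1 = {a1_n3(q):+.5f}   a2 = {a2_n3(q):+.3f}")
```

Small differences in the last printed place against the tabulated figures are to be expected, since the table entries were rounded by hand.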















(ii) It is easy to see by partial integration that the expected value of $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ can be expressed in terms of integrals of $a_1$ and $a_2$ over Q from 0 to $\infty$. Thus,

$$E\left\{\frac{x_n - \bar{x}}{\sigma}\right\} = \int_0^\infty Q\,P_n'(Q)\,dQ = -\frac{4}{3}\int_0^\infty a_1\,dQ = -\frac{32}{25}\int_0^\infty a_2\,dQ,$$

where $P_n'(Q)$ is given by (33).

Since the expected values of $x_n - \bar{x}$ and $\bar{x} - x_1$ are equal, the expected value of Range/$\sigma$ or $(x_n - x_1)/\sigma$ is twice that of the extreme deviate. Hence

$$\frac{\text{Mean range}}{\sigma} = -\frac{8}{3}\int_0^\infty a_1\,dQ = -\frac{64}{25}\int_0^\infty a_2\,dQ.$$

By Gregory's formula for numerical quadrature we can obtain values of the integrals $\int a_1\,dQ$, $\int a_2\,dQ$ by simple summation. The values of the mean range for n = 3 to 9 thus obtained should agree reasonably with the exact values given in Table XXII of Tables for Statisticians and Biometricians, Part II. Table 8 below presents the values of the mean range obtained from $a_1$ and from $a_2$, alongside the exact values.


Table 8. Values of mean range ($\sigma$ = 1)

  n    Using a_1    Using a_2    Exact value

  3    1.69255      1.691        1.69257
  4    2.0552       2.053        2.05875
  5    2.3230       2.318        2.32593
  6    2.5361       2.518        2.53441
  7    2.7081       2.687        2.70436
  8    2.8504       2.862        2.84720
  9    2.9705       2.99         2.97003


Only as many decimals have been retained in columns (2) and (3) of Table 8 as were available for the values of $a_1$ and $a_2$ in Table 5. The agreement is satisfactory for our purpose.
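
Check (ii) is easy to repeat for n = 3, where $a_1$ is available in closed form. The sketch below is only an illustration: it assumes the closed form of $P_3'(Q)$ stated in the comments, builds $a_1$ from (31), and uses Simpson's rule in place of Gregory's formula. The upper limit 10 stands in for infinity, where the integrand is negligible.

```python
import math

K = 3.0 * math.sqrt(6.0) / (2.0 * math.pi)

def p1(q):
    """Assumed P_3'(Q) = K exp(-3Q^2/4) * int_0^{3Q/sqrt2} exp(-t^2/2) dt."""
    return K * math.exp(-0.75 * q * q) * math.sqrt(math.pi / 2.0) * math.erf(1.5 * q)

def p2(q):
    """P_3''(Q), obtained by differentiating p1 analytically."""
    inner = math.sqrt(math.pi / 2.0) * math.erf(1.5 * q)
    return K * (3.0 / math.sqrt(2.0) * math.exp(-3.0 * q * q)
                - 1.5 * q * math.exp(-0.75 * q * q) * inner)

def a1(q):
    """Formula (31): a_1 = (Q^2 P'' - Q P') / 4."""
    return 0.25 * (q * q * p2(q) - q * p1(q))

def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

upper, steps = 10.0, 2000
range_direct = 2.0 * simpson(lambda q: q * p1(q), 0.0, upper, steps)
range_from_a1 = -(8.0 / 3.0) * simpson(a1, 0.0, upper, steps)
print(f"direct: {range_direct:.5f}   from a1: {range_from_a1:.5f}   "
      f"3/sqrt(pi): {3.0 / math.sqrt(math.pi):.5f}")
```

Both routes should return the tabulated exact mean range for n = 3, which equals $3/\sqrt{\pi}$.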

Two examples of the use of studentized probability integral of the extreme deviate are 
given below. 

Example 1. This is taken from Snedecor's Statistical Methods (1946, 4th ed., p. 266).

A randomized block experiment with four strains of wheat A, B, C, D and five replications gave the following mean yields in pounds per plot.

     A       B       C       D
    34.4    34.8    33.7    28.4

An interesting feature is the similarity of the first three means. Analysis of variance showed significant differences among the four means, as Table 9 will show.

The variance ratio F for strains against error is 44.82/2.19 = 20.5, which is much larger than $F_{0.001} = 10.8$. As Snedecor says: 'one suspects that the highly significant F is attributable largely to the small yield of D'. But according to him 'no definite probability statements can be made about contrasts suggested by the data'. What he means perhaps is that a test of significance of the difference between 28.4 and the mean of 34.4, 34.8 and 33.7 by the usual t-test

    t = (34.3 - 28.4)/(standard error of this difference),

with 12 degrees of freedom is not valid, as 28.4 is the smallest mean. The appropriate criterion in this situation is the studentized extreme deviate, which in this case is

$$\frac{\bar{x} - x_1}{s_\nu} = \frac{32.8 - 28.4}{\sqrt{\tfrac{1}{5} \times 2.19}} = 6.7,$$

with n = 4 and $\nu$ = 12. Referring to Table 5 (p. 142 below) we find that Q = 6.7 is far beyond the limits of the table, showing that it is a very exceptional value.


Table 9. Analysis of variance

  Source of     Degrees of    Sum of      Variance
  variation     freedom       squares

  Blocks          4            21.46        5.36
  Strains         3           134.45       44.82
  Error          12            26.26        2.19

  Total          19           182.17


Having thus concluded that the smallest mean 28.4 is significantly smaller than the other three means, we are justified in saying that D is definitely inferior to A, B and C. Of course, as a general rule, no categorical conclusions can be drawn from the results of a single experiment, and only by repeating the experiment a number of times and seeing whether D turns out to give the smallest mean every time can our conclusion from the first results be definitely established.

Example 2. This is artificially built up from Example 1 by changing the error variance from 2.19 to 13.00. The latter gives a standard error per plot of 11%, which is not too excessive a value to expect in ordinary field experiments. The variance ratio for strains against this error is F = 44.82/13.00 = 3.45, which is not significant at the 5% level. The conclusion is therefore that there are no significant differences among the means of the four strains A, B, C, D.

But if we compare the smallest mean against the general mean and calculate the studentized extreme deviate

$$\frac{\bar{x} - x_1}{s_\nu} = \frac{32.8 - 28.4}{\sqrt{\tfrac{1}{5} \times 13.00}} = 2.73,$$

we find from Table 6 (see p. 143 below) that the probability of getting this or a larger value when n = 4, $\nu$ = 12 lies between 0.05 and 0.01. At the 5% level, therefore, 28.4 is significantly smaller than the general mean, indicating that D is inferior to A, B and C. Although this is an artificially constructed example, it helps to illustrate a situation which may occasionally arise.
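
The arithmetic of Examples 1 and 2 can be retraced as follows. This is a minimal sketch with a hypothetical helper function; it uses the unrounded general mean 32.825 instead of 32.8, so the second decimal of the result may differ slightly from the figures quoted in the text.

```python
import math

def studentized_extreme_deviate(means, error_variance, replications, lowest=True):
    """(general mean - smallest mean)/s, or (largest mean - general mean)/s,
    where s is the standard error of a single mean."""
    general_mean = sum(means) / len(means)
    standard_error = math.sqrt(error_variance / replications)
    if lowest:
        deviate = general_mean - min(means)
    else:
        deviate = max(means) - general_mean
    return deviate / standard_error

strain_means = [34.4, 34.8, 33.7, 28.4]   # strains A, B, C, D
strain_variance = 44.82                   # from Table 9

for label, error_variance in (("Example 1", 2.19), ("Example 2", 13.00)):
    f_ratio = strain_variance / error_variance
    q = studentized_extreme_deviate(strain_means, error_variance, replications=5)
    print(f"{label}: F = {f_ratio:.2f}, studentized extreme deviate = {q:.2f}")

q_example1 = studentized_extreme_deviate(strain_means, 2.19, 5)
q_example2 = studentized_extreme_deviate(strain_means, 13.00, 5)
```

The resulting deviates are then referred to Tables 5 and 6 with n = 4 and 12 degrees of freedom, as in the text.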


5. Use of G-functions in the distributions of a class of statistics based on ordered variates

Godwin introduced the G-functions primarily to obtain the distribution of the mean deviation from the mean. We have seen its use in the distribution of the extreme deviate from the mean. We shall now consider certain other statistics based on the ordered variates $x_1, \ldots, x_n$ in whose sampling distributions the G-function appears. The distributions derived are in the form of single or double integrals involving the G-functions in the integrand. Only by numerical integration using the G-function tables can the probability integrals of these distributions be calculated. In many cases where only upper and lower percentage points are required, the approximations (27) and (28) to the G-functions may be quite adequate.

Let us consider a general statistic

$$\delta = (x_n + \dots + x_{n-l+1})/l - (x_1 + \dots + x_k)/k, \tag{34}$$

where $x_1 \le \dots \le x_n$ and $k + l \le n$.

When k = l, $\delta$ is the same as the statistic suggested by Jones (1946) for a quick estimation of $\sigma$ in certain situations where (say) only 5% of the observations at the two ends are available. As special cases of this statistic we have the range and the mean deviation from the median (see, for example, Nair, 1947).

When k + l = n, $\delta$ may be written in three different forms:

$$\begin{aligned}
\delta &= \frac{1}{n-k}(x_{k+1} + \dots + x_n) - \frac{1}{k}(x_1 + \dots + x_k) \\
&= \frac{n}{k}\{(x_{k+1} + \dots + x_n)/(n-k) - \bar{x}\} \\
&= \frac{n}{n-k}\{\bar{x} - (x_1 + \dots + x_k)/k\}.
\end{aligned} \tag{35}$$

When k = (n - 1) or 1, the $\delta$ of (35) reduces to the extreme deviate $x_n - \bar{x}$ or $\bar{x} - x_1$, multiplied by the factor n/(n - 1).
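
The equivalence of the three forms in (35) is quickly confirmed on a made-up sample. In the sketch below the data and the function names are purely illustrative; the symbol written `delta` is the general statistic of (34).

```python
def delta_general(xs, k, l):
    """Statistic (34): mean of the l largest minus mean of the k smallest values."""
    xs = sorted(xs)
    n = len(xs)
    return sum(xs[n - l:]) / l - sum(xs[:k]) / k

def delta_three_forms(xs, k):
    """The three expressions of (35), valid when k + l = n."""
    xs = sorted(xs)
    n = len(xs)
    mean = sum(xs) / n
    lower, upper = sum(xs[:k]), sum(xs[k:])
    form1 = upper / (n - k) - lower / k
    form2 = (n / k) * (upper / (n - k) - mean)
    form3 = (n / (n - k)) * (mean - lower / k)
    return form1, form2, form3

sample = [2.1, -0.4, 0.7, 1.5, -1.2, 0.3]      # hypothetical observations
forms = delta_three_forms(sample, k=2)
print(forms)

# Special case k = n - 1: the statistic is n/(n - 1) times the extreme deviate.
n = len(sample)
extreme_deviate = max(sample) - sum(sample) / n
print(delta_general(sample, n - 1, 1), n / (n - 1) * extreme_deviate)
```

The second print line shows the special case k = n - 1, in which the statistic is n/(n - 1) times the extreme deviate.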

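To obtain the distribution of $\delta$ of (34), we start off by introducing the following variates:

$$\begin{aligned}
\bar{x} &= \frac{1}{n}(x_1 + \dots + x_n), \\
y &= \frac{(x_n + \dots + x_{n-l+1}) + (x_1 + \dots + x_k)}{k+l} - \frac{x_{k+1} + \dots + x_{n-l}}{n-k-l}, \\
u_i &= i\,x_{i+1} - (x_i + \dots + x_1) \quad (i = 1, \dots, (k-1)), \\
v_i &= i\,x_{k+i+1} - (x_{k+i} + \dots + x_{k+1}) \quad (i = 1, \dots, (n-k-l-1)), \\
w_i &= (x_n + \dots + x_{n-i+1}) - i\,x_{n-i} \quad (i = 1, \dots, (l-1)).
\end{aligned} \tag{36}$$

Dividing the n new variates $\delta$, $\bar{x}$, y, $u_i$, $v_i$, $w_i$ by the square root of the sum of the squares of the coefficients of the x's on the right-hand side, we get the following orthogonal system of transformed variates:

$$\begin{aligned}
\delta' &= \sqrt{\frac{kl}{k+l}}\,\delta, \qquad \bar{x}' = \sqrt{n}\,\bar{x}, \qquad y' = \sqrt{\frac{(k+l)(n-k-l)}{n}}\,y, \\
u_i' &= \frac{u_i}{\sqrt{i(i+1)}} \quad (i = 1, \dots, (k-1)), \\
v_i' &= \frac{v_i}{\sqrt{i(i+1)}} \quad (i = 1, \dots, (n-k-l-1)), \\
w_i' &= \frac{w_i}{\sqrt{i(i+1)}} \quad (i = 1, \dots, (l-1)).
\end{aligned} \tag{37}$$

It follows that

$$\begin{aligned}
\sum_1^n x_i^2 &= \bar{x}'^2 + \delta'^2 + y'^2 + \sum_1^{k-1} u_i'^2 + \sum_1^{n-k-l-1} v_i'^2 + \sum_1^{l-1} w_i'^2 \\
&= n\bar{x}^2 + \frac{kl}{k+l}\delta^2 + \frac{(k+l)(n-k-l)}{n}y^2 + \sum_1^{k-1}\frac{u_i^2}{i(i+1)} + \sum_1^{n-k-l-1}\frac{v_i^2}{i(i+1)} + \sum_1^{l-1}\frac{w_i^2}{i(i+1)},
\end{aligned} \tag{38}$$

$$\begin{aligned}
\prod_1^n dx_i &= d\bar{x}'\,d\delta'\,dy' \prod_1^{k-1} du_i' \prod_1^{n-k-l-1} dv_i' \prod_1^{l-1} dw_i' \\
&= \frac{d\bar{x}\,d\delta\,dy \prod_1^{k-1} du_i \prod_1^{n-k-l-1} dv_i \prod_1^{l-1} dw_i}{(k-1)!\,(l-1)!\,(n-k-l-1)!}.
\end{aligned} \tag{39}$$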
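
The orthogonality of the transformed system and the factorial divisor in (39) can be confirmed numerically for any particular n, k, l with k + l < n. The sketch below is an illustration under the assumption that the new variates are $\delta$, $\bar{x}$, y, $u_i$, $v_i$, $w_i$ with $u_i = i x_{i+1} - (x_1 + \dots + x_i)$, $v_i = i x_{k+i+1} - (x_{k+1} + \dots + x_{k+i})$ and $w_i = (x_n + \dots + x_{n-i+1}) - i x_{n-i}$; the function names are hypothetical.

```python
import math

def transformation_rows(n, k, l):
    """Coefficients of x_1..x_n in delta, xbar, y, u_i, v_i, w_i (needs k + l < n)."""
    m = n - k - l
    rows = []
    delta = [0.0] * n
    for j in range(k):
        delta[j] = -1.0 / k
    for j in range(n - l, n):
        delta[j] = 1.0 / l
    rows.append(delta)
    rows.append([1.0 / n] * n)                      # xbar
    y = [1.0 / (k + l)] * n
    for j in range(k, n - l):
        y[j] = -1.0 / m
    rows.append(y)
    for i in range(1, k):                           # u_i on the k smallest
        r = [0.0] * n
        for j in range(i):
            r[j] = -1.0
        r[i] = float(i)
        rows.append(r)
    for i in range(1, m):                           # v_i on the middle group
        r = [0.0] * n
        for j in range(k, k + i):
            r[j] = -1.0
        r[k + i] = float(i)
        rows.append(r)
    for i in range(1, l):                           # w_i on the l largest
        r = [0.0] * n
        for j in range(n - i, n):
            r[j] = 1.0
        r[n - i - 1] = -float(i)
        rows.append(r)
    return rows

def check(n, k, l):
    """Return (largest departure from orthonormality, product of row norms)."""
    rows = transformation_rows(n, k, l)
    norms = [math.sqrt(sum(c * c for c in r)) for r in rows]
    unit = [[c / s for c in r] for r, s in zip(rows, norms)]
    worst = 0.0
    for a in range(n):
        for b in range(n):
            dot = sum(p * q for p, q in zip(unit[a], unit[b]))
            worst = max(worst, abs(dot - (1.0 if a == b else 0.0)))
    return worst, math.prod(norms)

worst, norm_product = check(9, 3, 2)
expected = math.factorial(2) * math.factorial(1) * math.factorial(3)
print(worst, norm_product, expected)
```

For n = 9, k = 3, l = 2 the product of the row norms should equal (k - 1)! (l - 1)! (n - k - l - 1)! = 12, which is the divisor appearing in (39).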


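The joint distribution of $\bar{x}$, $\delta$, y, $u_i$, $v_i$, $w_i$ can be obtained from the joint distribution (10) of $x_1, \ldots, x_n$ in the form

$$\begin{aligned}
&\frac{n!\,(2\pi)^{-\frac{1}{2}n}}{(k-1)!\,(l-1)!\,(n-k-l-1)!} \\
&\quad\times \exp\left[-\tfrac{1}{2}\left\{n\bar{x}^2 + \frac{kl}{k+l}\delta^2 + \frac{(k+l)(n-k-l)}{n}y^2 + \sum_1^{k-1}\frac{u_i^2}{i(i+1)} + \sum_1^{n-k-l-1}\frac{v_i^2}{i(i+1)} + \sum_1^{l-1}\frac{w_i^2}{i(i+1)}\right\}\right] \\
&\quad\times d\bar{x}\,d\delta\,dy \prod_1^{k-1} du_i \prod_1^{n-k-l-1} dv_i \prod_1^{l-1} dw_i.
\end{aligned} \tag{40}$$

Integrating out for $\bar{x}$ between the limits $\pm\infty$, the joint distribution of $\delta$, y, $u_i$, $v_i$, $w_i$ is

$$\begin{aligned}
&\frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!\,(n-k-l-1)!} \\
&\quad\times \exp\left[-\tfrac{1}{2}\left\{\frac{kl}{k+l}\delta^2 + \frac{(k+l)(n-k-l)}{n}y^2 + \sum_1^{k-1}\frac{u_i^2}{i(i+1)} + \sum_1^{n-k-l-1}\frac{v_i^2}{i(i+1)} + \sum_1^{l-1}\frac{w_i^2}{i(i+1)}\right\}\right] \\
&\quad\times d\delta\,dy \prod_1^{k-1} du_i \prod_1^{n-k-l-1} dv_i \prod_1^{l-1} dw_i.
\end{aligned} \tag{41}$$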




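To integrate out for y, $u_i$, $v_i$ and $w_i$ we have to note that corresponding to the inequalities $x_1 \le \dots \le x_n$ we now have

$$\begin{aligned}
&0 \le u_1 \le \dots \le u_{k-1}, \qquad 0 \le v_1 \le \dots \le v_{n-k-l-1}, \qquad 0 \le w_1 \le \dots \le w_{l-1}, \\
&\frac{1}{k}u_{k-1} + \sum_1^{n-k-l-1}\frac{v_i}{i(i+1)} \le \frac{l\delta}{k+l} - y, \\
&\frac{1}{l}w_{l-1} + \frac{v_{n-k-l-1}}{n-k-l} \le \frac{k\delta}{k+l} + y.
\end{aligned} \tag{42}$$

It is extremely difficult to carry out the integration for y, $u_i$, $v_i$ and $w_i$ subject to the inequalities (42). In the following special cases, however, the last two inequalities in (42) take a simpler form and the integration can be carried forward to a certain stage in terms of G-functions.

Case I. k + l = n. Here y and $v_i$ disappear and the inequalities (42) reduce to

$$0 \le u_1 \le \dots \le u_{k-1}, \qquad 0 \le w_1 \le \dots \le w_{l-1}, \qquad \frac{u_{k-1}}{k} + \frac{w_{l-1}}{l} \le \delta. \tag{43}$$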

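The joint distribution of $\delta$, $u_i$ and $w_i$ is

$$\frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!}\exp\left[-\tfrac{1}{2}\left\{\frac{kl}{n}\delta^2 + \sum_1^{k-1}\frac{u_i^2}{i(i+1)} + \sum_1^{l-1}\frac{w_i^2}{i(i+1)}\right\}\right] d\delta \prod_1^{k-1} du_i \prod_1^{l-1} dw_i. \tag{44}$$

Putting $u_{k-1} = \theta$ and writing the last inequality in (43) as $w_{l-1} \le l(\delta - \theta/k)$, we can integrate out $u_i$ and $w_i$ from (44) and obtain the distribution of $\delta$ in the form

$$f(\delta) = \frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!}\exp\left[-\frac{kl}{2n}\delta^2\right]\int_0^{k\delta} G_{k-1}'(\theta)\,G_{l-1}\{l(\delta - \theta/k)\}\,d\theta, \tag{45}$$

where

$$G_{k-1}'(\theta) = \exp\left[-\frac{\theta^2}{2k(k-1)}\right] G_{k-2}(\theta).$$

If we had written $w_{l-1} = \epsilon$ and $u_{k-1} \le k(\delta - \epsilon/l)$, we should get an alternate form which is the same as (45) with k and l interchanged.

Putting l = 1 and $\delta = nu/(n-1)$ in (45) we get the distribution of the extreme deviate $x_n - \bar{x}$ given in (19).

Putting l = 2, which is appropriate for the case of two outlying observations in the same direction, the distribution of $\delta$ takes the form

$$f(\delta) = n(n-1)(n-2)\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}\exp\left[-\frac{n-2}{n}\delta^2\right]\int_0^{2\delta} G_{n-3}\{(n-2)(\delta - \tfrac{1}{2}\theta)\}\,e^{-\frac{1}{4}\theta^2}\,d\theta. \tag{46}$$

If k = l = m and n = 2m, $\delta$ becomes twice the mean deviation from the median, whose distribution has been obtained by Godwin (unpublished).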

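Case II. k + l = n - 1. Here the $v_i$'s disappear and the inequalities (42) reduce to

$$0 \le u_1 \le \dots \le u_{k-1} \le k\left\{\frac{l\delta}{n-1} - y\right\}, \qquad 0 \le w_1 \le \dots \le w_{l-1} \le l\left\{\frac{k\delta}{n-1} + y\right\}. \tag{47}$$

The joint distribution of $\delta$, y, $u_i$'s and $w_i$'s is

$$\frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!}\exp\left[-\tfrac{1}{2}\left\{\frac{kl}{n-1}\delta^2 + \frac{n-1}{n}y^2 + \sum_1^{k-1}\frac{u_i^2}{i(i+1)} + \sum_1^{l-1}\frac{w_i^2}{i(i+1)}\right\}\right] d\delta\,dy \prod_1^{k-1} du_i \prod_1^{l-1} dw_i. \tag{48}$$

Integrating out y, $u_i$ and $w_i$ from (48) using the inequalities (47) we get the distribution of $\delta$ in the form

$$f(\delta) = \frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!}\exp\left[-\frac{kl}{2(n-1)}\delta^2\right]\int_{-k\delta/(n-1)}^{l\delta/(n-1)} G_{k-1}\left\{k\left(\frac{l\delta}{n-1} - y\right)\right\} G_{l-1}\left\{l\left(\frac{k\delta}{n-1} + y\right)\right\}\exp\left[-\frac{n-1}{2n}y^2\right]dy. \tag{49}$$

Putting k = l = m and n = 2m + 1, $\delta$ becomes n/m times the mean deviation from the median, whose distribution has been obtained by Godwin (unpublished). If m = 1, in particular, $\delta$ becomes the range in a sample of 3 and we get

$$f(\delta) = \frac{\sqrt{3}}{\pi}\,e^{-\frac{1}{4}\delta^2}\int_{-\frac{1}{2}\delta}^{\frac{1}{2}\delta} e^{-\frac{1}{3}y^2}\,dy = \frac{2\sqrt{3}}{\pi}\,e^{-\frac{1}{4}\delta^2}\int_0^{\frac{1}{2}\delta} e^{-\frac{1}{3}y^2}\,dy, \tag{50}$$

which is the form given by McKay & Pearson (1933).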
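Case III. k + l = n - 2. The inequalities become

$$\begin{aligned}
0 \le u_1 \le \dots \le u_{k-1} &\le k\left(\frac{l\delta}{n-2} - y - \tfrac{1}{2}v_1\right), \\
0 \le w_1 \le \dots \le w_{l-1} &\le l\left(\frac{k\delta}{n-2} + y - \tfrac{1}{2}v_1\right).
\end{aligned} \tag{51}$$

The distribution of $\delta$ comes out in the form of a double integral involving the product of two G-functions:

$$\begin{aligned}
f(\delta) = {} & \frac{n!\,n^{-\frac{1}{2}}(2\pi)^{-\frac{1}{2}(n-1)}}{(k-1)!\,(l-1)!}\exp\left[-\frac{kl}{2(n-2)}\delta^2\right] \\
& \times \int_0^{\delta}\int_{-\frac{k\delta}{n-2}+\frac{1}{2}v_1}^{\frac{l\delta}{n-2}-\frac{1}{2}v_1} G_{k-1}\left\{k\left(\frac{l\delta}{n-2} - y - \tfrac{1}{2}v_1\right)\right\} G_{l-1}\left\{l\left(\frac{k\delta}{n-2} + y - \tfrac{1}{2}v_1\right)\right\}\exp\left[-\frac{n-2}{n}y^2 - \tfrac{1}{4}v_1^2\right]dy\,dv_1.
\end{aligned} \tag{52}$$

If n = 2m and k = l = m - 1, $\delta$ becomes a special case of Jones's statistic. In particular, if m = 2, this reduces to the range in samples of 4 and its distribution becomes

$$f(\delta) = \frac{6}{\pi\sqrt{2\pi}}\,e^{-\frac{1}{4}\delta^2}\int_0^{\delta} e^{-\frac{1}{4}v_1^2}\int_{-\frac{1}{2}(\delta - v_1)}^{\frac{1}{2}(\delta - v_1)} e^{-\frac{1}{2}x^2}\,dx\,dv_1. \tag{53}$$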

I should like to acknowledge warmly the help and guidance I have received from Prof. 
Pearson and Dr Hartley in the course of this investigation. 

My thanks are also due to Dr E. T. Goodwin and Mr T. Vickers of the Mathematics Division 
of the National Physical Laboratory for their expert help in carrying out the calculations 
required for Tables 1 and 5. 










Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n

(Tabulated entries are in units of $10^{-6}$.)

  u \ n        3         4         5         6         7         8         9

  0.50    259 684   128 073    62 880    30 805    15 073     7 370     3 602
  0.51    268 351   134 495    67 103    33 407    16 611     8 254     4 099
  0.52    277 066   141 054    71 487    36 151    18 269     9 216     4 649
  0.53    285 823   147 748    76 029    39 039    20 021    10 260     5 256
  0.54    294 616   154 572    80 730    42 073    21 899    11 391     5 922

  0.55    303 442   161 522    85 589    45 255    23 899    12 612     6 652
  0.56    312 295   168 594    90 604    48 587    26 023    13 928     7 450
  0.57    321 170   175 782    95 774    52 069    28 275    15 343     8 321
  0.58    330 063   183 084   101 097    55 705    30 657    16 860     9 267
  0.59    338 969   190 494   106 572    59 493    33 173    18 484    10 294

  0.60    347 883   198 008   112 195    63 436    35 825    20 218    11 404
  0.61    356 801   205 621   117 965    67 534    38 616    22 066    12 602
  0.62    365 720   213 328   123 879    71 786    41 649    24 032    13 893
  0.63    374 633   221 124   129 935    76 193    44 625    26 118    15 279
  0.64    383 538   229 006   136 129    80 754    47 846    28 329    16 766

  0.65    392 430   236 967   142 459    85 468    51 214    30 668    18 356
  0.66    401 305   245 003   148 921    90 334    54 730    33 137    20 054
  0.67    410 159   253 110   155 511    95 352    58 396    35 740    21 863
  0.68    418 989   261 283   162 227   100 521    62 213    38 478    23 787
  0.69    427 791   269 516   169 064   105 838    66 180    41 355    25 830

  0.70    436 562   277 805   176 019   111 302    70 300    44 373    27 995
  0.71    445 298   286 145   183 088   116 912    74 572    47 534    30 286
  0.72    453 996   294 532   190 266   122 666    78 996    50 839    32 703
  0.73    462 652   302 960   197 550   128 560    83 571    54 290    35 253
  0.74    471 264   311 425   204 935   134 593    88 298    57 889    37 936

  0.75    479 829   319 923   212 417   140 761    93 176    61 637    40 756
  0.76    488 344   328 448   219 992   147 062    98 203    65 535    43 716
  0.77    496 805   336 997   227 655   153 493   103 380    69 583    46 814
  0.78    505 211   345 564   235 402   160 050   108 703    73 781    50 057
  0.79    513 559   354 147   243 228   166 731   114 172    78 131    53 444

  0.80    521 847   362 739   251 129   173 531   119 785    82 632    56 978
  0.81    530 072   371 337   259 100   180 447   125 540    87 284    60 660
  0.82    538 232   379 938   267 137   187 476   131 434    92 086    64 490
  0.83    546 325   388 536   275 235   194 613   137 466    97 038    68 471
  0.84    554 349   397 129   283 390   201 855   143 632   102 138    72 601

  0.85    562 303   405 711   291 597   209 197   149 930   107 386    76 883
  0.86    570 184   414 280   299 851   216 636   156 357   112 780    81 315
  0.87    577 991   422 831   308 149   224 167   162 910   118 319    85 899
  0.88    585 722   431 362   316 484   231 786   169 585   124 000    90 633
  0.89    593 376   439 868   324 854   239 488   176 380   129 822    95 517

  0.90    600 951   448 346   333 254   247 270   183 290   135 783   100 550
  0.91    608 446   456 793   341 679   255 127   190 313   141 880   105 732
  0.92    615 860   465 206   350 125   263 054   197 444   148 111   111 061
  0.93    623 192   473 581   358 588   271 047   204 680   154 472   116 536
  0.94    630 441   481 917   367 063   279 102   212 016   160 961   122 154

  0.95    637 606   490 210   375 547   287 214   219 449   167 575   127 913
  0.96    644 686   498 457   384 035   295 379   226 974   174 310   133 811
  0.97    651 680   506 656   392 523   303 591   234 587   181 164   139 848
  0.98    658 588   514 803   401 008   311 848   242 285   188 132   146 021
  0.99    665 408   522 895   409 486   320 143   250 062   195 211   152 329

  1.00    672 142   530 930   417 952   328 474   257 914   202 397   158 771









Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n



328 474 
336 835 
345 222 
353 632 
362 059 


387 405 
395 861 
404 316 

412 764 
421 202 
429 626 
438 032 
446 418 

454 780 
463 114 
471 417 


536 375 
544 271 
552 109 
559 887 
567 603 

575 255 
582 841 
590 359 
597 808 
605 185 

612 491 




257 914 
265 838 
273 827 
281 879 
289 989 

298 151 
306 362 
314 618 
322 913 
331 244 

339 606 
347 995 
356 406 
364 835 
373 278 

381 732 
390 191 
398 652 
407 110 
415 563 

424 006 
432 435 
440 847 
449 239 
457 606 


474 255 
482 530 
490 767 
498 965 

507 120 
515 229 
523 290 
531 300 


202 397 
209 687 
217 076 
224 561 
232 136 

239 799 
247 544 
255 368 
263 266 
271 234 

279 267 
287 360 
295 510 


446 849 
455 252 
463 629 
471 978 




192 798 
199 947 
207 203 
214 561 
222 018 

229 568 
237 209 
244 935 
252 743 
260 627 
















Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n

(Tabulated entries are in units of $10^{-6}$.)

  u \ n        3         4         5         6         7         8         9

  1.50    900 750   834 724   771 701   712 611   657 619   606 627   559 440
  1.51    903 428   838 710   776 792   718 621   664 377   613 983   567 262
  1.52    906 045   842 619   781 800   724 545   671 053   621 265   575 021
  1.53    908 604   846 463   786 724   730 383   677 647   628 471   582 713
  1.54    911 105   850 214   791 565   736 136   684 156   635 600   590 337

  1.55    913 550   853 901   796 324   741 803   690 582   642 650   597 892
  1.56    915 938   857 515   801 000   747 384   696 923   649 620   605 374
  1.57    918 271   861 058   805 595   752 879   703 179   656 509   612 783
  1.58    920 550   864 531   810 109   758 289   709 349   663 317   620 117
  1.59    922 775   867 933   814 543   763 613   715 433   670 042   627 375

  1.60    924 949   871 266   818 897   768 852   721 431   676 684   634 555
  1.61    927 071   874 531   823 172   774 007   727 343   683 241   641 656
  1.62    929 142   877 728   827 368   779 076   733 169   689 714   648 677
  1.63    931 164   880 859   831 487   784 062   738 908   696 101   655 616
  1.64    933 137   883 924   835 529   788 964   744 560   702 402   662 473

  1.65    935 062   886 925   839 494   793 783   750 126   708 617   669 246
  1.66    936 940   889 862   843 384   798 518   755 605   714 745   675 935
  1.67    938 772   892 737   847 199   803 172   760 999   720 786   682 540
  1.68    940 559   895 549   850 941   807 743   766 306   726 740   689 058
  1.69    942 301   898 300   854 609   812 233   771 527   732 607   695 491

  1.70    944 000   900 991   858 205   816 643   776 663   738 387   701 837
  1.71    945 656   903 623   861 729   820 973   781 714   744 078   708 095
  1.72    947 270   906 197   865 182   825 223   786 680   749 083   714 206
  1.73    948 843   908 713   868 566   829 395   791 561   755 200   720 349
  1.74    950 376   911 172   871 881   833 488   796 359   760 630   726 344

  1.75    951 870   913 577   875 128   837 505   801 073   765 973   732 251
  1.76    953 324   915 926   878 308   841 445   805 704   771 229   738 069
  1.77    954 741   918 222   881 421   845 309   810 253   776 398   743 799
  1.78    956 121   920 465   884 469   849 099   814 720   781 482   749 440
  1.79    957 464   922 656   887 453   852 814   819 106   786 479   754 993

  1.80    958 772   924 795   890 373   856 456   823 411   791 391   760 458
  1.81    960 045   926 885   893 230   860 025   827 637   796 219   765 834
  1.82    961 284   928 925   896 026   863 523   831 783   800 961   771 123
  1.83    962 489   930 917   898 760   866 950   835 851   805 020   776 324
  1.84    963 662   932 861   901 435   870 308   839 842   810 196   781 438

  1.85    964 803   934 759   904 051   873 596   843 756   814 689   786 466
  1.86    965 913   936 610   906 608   876 816   847 593   819 099   791 407
  1.87    966 992   938 417   909 109   879 969   851 356   823 429   796 261
  1.88    968 042   940 180   911 553   883 056   855 044   827 677   801 031
  1.89    969 062   941 899   913 942   886 077   858 658   831 846   805 715

  1.90    970 054   943 576   916 276   889 034   862 200   835 935   810 315
  1.91    971 018   945 211   918 557   891 927   865 670   839 945   814 832
  1.92    971 954   946 805   920 785   894 757   869 068   843 878   819 265
  1.93    972 864   948 359   922 962   897 526   872 397   847 734   823 616
  1.94    973 748   949 874   925 087   900 234   875 657   851 514   827 885

  1.95    974 607   951 350   927 163   902 882   878 848   855 219   832 073
  1.96    975 441   952 789   929 189   905 471   881 972   858 849   836 181
  1.97    976 251   954 190   931 168   908 002   885 029   862 405   840 209
  1.98    977 037   955 556   933 099   910 476   888 021   865 889   844 159
  1.99    977 800   956 886   934 983   912 894   890 949   869 301   848 031

  2.00    978 541   958 181   936 822   915 257   893 813   872 642   851 826



Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n

(Tabulated entries are in units of $10^{-6}$.)

  u \ n        3         4         5         6         7         8         9

  2.00    978 541   958 181   936 822   915 257   893 813   872 642   851 826
  2.01    979 260   959 442   938 616   917 566   896 614   875 913   855 544
  2.02    979 958   960 670   940 366   919 821   899 353   879 116   859 188
  2.03    980 635   961 866   942 073   922 023   902 032   882 250   862 757
  2.04    981 291   963 030   943 738   924 174   904 650   885 317   866 252

  2.05    981 928   964 162   945 362   926 274   907 210   888 318   869 675
  2.06    982 545   965 265   946 945   928 325   909 712   891 253   873 027
  2.07    983 144   966 337   948 488   930 326   912 157   894 124   876 307
  2.08    983 724   967 380   949 992   932 280   914 545   896 932   879 518
  2.09    984 286   968 395   951 458   934 187   916 879   899 678   882 661

  2.10    984 832   969 382   952 886   936 047   919 158   902 362   885 735
  2.11    985 360   970 342   954 278   937 862   921 384   904 986   888 742
  2.12    985 872   971 275   955 633   939 632   923 558   907 550   891 684
  2.13    986 367   972 182   956 954   941 359   925 680   910 056   894 561
  2.14    986 847   973 064   958 240   943 043   927 752   912 504   897 373

  2.15    987 312   973 921   959 493   944 685   929 775   914 896   900 123
  2.16    987 763   974 754   960 712   946 286   931 748   917 232   902 811
  2.17    988 198   975 563   961 899   947 846   933 674   919 514   905 438
  2.18    988 620   976 349   963 055   949 367   935 553   921 741   908 005
  2.19    989 029   977 113   964 180   950 850   937 386   923 916   910 513

  2.20    989 424   977 855   965 274   952 294   939 173   926 040   912 963
  2.21    989 806   978 575   966 339   953 701   940 917   928 112   915 355
  2.22    990 176   979 275   967 375   955 072   942 617   930 134   917 692
  2.23    990 534   979 954   968 383   956 407   944 274   932 107   919 974
  2.24    990 880   980 613   969 364   957 708   945 890   934 032   922 201

  2.25    991 214   981 253   970 317   958 974   947 465   935 910   924 375
  2.26    991 538   981 874   971 244   960 207   949 000   937 742   926 498
  2.27    991 851   982 476   972 145   961 407   950 496   939 528   928 568
  2.28    992 153   983 061   973 022   962 575   951 953   941 270   930 589
  2.29    992 445   983 628   973 873   963 712   953 373   942 968   932 560

  2.30    992 727   984 178   974 701   964 819   954 756   944 623   934 483
  2.31    992 999   984 711   975 505   965 896   956 102   946 236   936 358
  2.32    993 263   985 228   976 287   966 943   957 414   947 808   938 187
  2.33    993 517   985 730   977 046   967 962   958 691   949 340   939 970
  2.34    993 763   986 216   977 784   968 953   959 934   950 832   941 708

  2.35    994 000   986 687   978 500   969 916   961 144   952 286   943 402
  2.36    994 229   987 144   979 196   970 853   962 322   953 702   945 053
  2.37    994 450   987 587   979 871   971 765   963 468   955 081   946 662
  2.38    994 603   988 015   980 527   972 650   964 583   956 424   948 230
  2.39    994 869   988 431   981 164   973 511   965 668   957 732   949 758

  2.40    995 067   988 833   981 782   974 348   966 723   959 005   951 245
  2.41    995 259   989 223   982 381   975 161   967 750   960 243   952 694
  2.42    995 443   989 600   982 963   975 951   968 748   961 449   954 105
  2.43    995 621   989 960   983 528   976 718   969 719   962 622   955 479
  2.44    995 793   990 320   984 075   977 464   970 663   963 764   956 817

  2.45    995 959   990 662   984 607   978 188   971 581   964 875   958 119
  2.46    996 118   990 993   985 122   978 891   972 473   965 955   959 386
  2.47    996 272   991 314   985 622   979 574   973 340   967 006   960 620
  2.48    996 420   991 625   986 106   980 237   974 183   968 028   961 820
  2.49    996 563   991 925   986 576   980 881   975 001   969 021   962 987

  2.50    996 701   992 215   987 031   981 505   975 797   969 987   964 123



Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n











Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n





909 642 
999 669 
999 676 
999 690 
999 706 


998 936 

998 981 

999 024 
999 066 
999 106 


998 010 
998 088 
998 164 
998 238 
998 308 


995 823 

995 978 

996 128 
996 273 
996 413 


994 639 

994 835 

995 024 
995 207 
995 383 


993 421 
993 658 

993 887 

994 108 
994 322 


999 719 
999 732 
999 745 
999 767 
999 769 


999 143 
999 179 
999 215 
999 248 
999 281 


998 376 
998 441 
998 504 
998 565 
998 623 


996 548 
996 678 
996 804 

996 925 

997 043 


995 554 
995 718 

995 877 

996 031 
996 179 


994 528 
994 728 

994 921 

995 107 
995 287 


999 312 
999 341 
999 370 
999 397 
999 424 


998 679 
998 733 
998 785 
998 834 
998 882 


997 155 
997 264 
997 369 
997 471 
997 568 


996 322 
996 460 
996 594 
996 722 
996 846 


995 461 
995 629 
995 792 

995 948 

996 099 


999 449 
999 473 
999 496 
999 519 
999 540 


998 929 

998 973 

999 015 
999 056 
999 096 


997 662 
997 753 
997 840 

997 925 

998 006 


996 966 

997 082 
997 193 
997 301 
997 404 


996 245 
996 386 
996 522 
996 653 
996 780 


999 560 
999 580 
999 599 
999 617 


999 134 
999 170 
999 205 
999 238 
999 270 


998 084 
998 159 
998 232 
998 302 
998 369 


997 504 
997 600 
997 693 
997 783 
997 869 


996 902 

997 020 
997 133 
997 243 
997 349 


999 301 
999 331 
999 359 
999 387 
999 413 


998 434 
998 497 
998 557 
998 615 
998 670 


997 952 

998 032 
998 109 
998 184 
998 255 


997 451 
997 549 
997 644 
997 735 
997 823 


999 438 
999 462 
999 486 
999 508 
999 529 


998 724 
998 775 
998 825 
998 873 
998 919 


998 324 
998 391 
998 455 
998 516 
998 575 


997 908 

997 989 

998 068 
998 144 
998 217 


998 963 

999 005 
999 046 
999 085 
999 123 


998 633 
998 687 
998 740 
998 791 
998 840 


998 287 
998 355 
998 420 
998 483 
998 544 


999 159 
999 194 
999 22 8 
999 260 
999 291 


998 887 
998 932 

998 976 

999 018 
999 058 


998 602 
998 658 
998 712 
998 764 
998 813 


999 320 
999 349 
999 376 
999 403 
999 428 


999 097 
999 134 
999 169 
999 204 
999 237 


998 862 
998 908 
998 952 

998 995 

999 036 


999 452 


999 269 


999 076 










Table 1 (cont.). Probability integral of the extreme deviate $(x_n - \bar{x})/\sigma$ or $(\bar{x} - x_1)/\sigma$ in 'normal' samples of size n

(Tabulated entries are in units of $10^{-6}$.)

  u \ n        3         4         5         6         7         8         9

  3.50    999 973   999 894   999 772   999 622   999 452   999 269   999 076
  3.51    999 974   999 899   999 783   999 638   999 476   999 299   999 114
  3.52    999 976   999 904   999 792   999 654   999 498   999 328   999 150
  3.53    999 977   999 908   999 802   999 669   999 519   999 357   999 185
  3.54    999 978   999 913   999 811   999 684   999 540   999 384   999 219

  3.55    999 979   999 917   999 820   999 698   999 560   999 410   999 251
  3.56    999 981   999 921   999 828   999 711   999 579   999 435   999 282
  3.57    999 982   999 925   999 836   999 724   999 597   999 458   999 312
  3.58    999 983   999 929   999 843   999 736   999 614   999 481   999 341
  3.59    999 984   999 932   999 851   999 748   999 631   999 503   999 369

  3.60    999 985   999 935   999 858   999 759   999 647   999 525   999 395
  3.61    999 985   999 939   999 864   999 770   999 662   999 545   999 421
  3.62    999 986   999 942   999 870   999 780   999 677   999 564   999 445
  3.63    999 987   999 945   999 876   999 790   999 691   999 583   999 469
  3.64    999 988   999 947   999 882   999 800   999 705   999 601   999 491

  3.65    999 988   999 950   999 888   999 809   999 718   999 618   999 513
  3.66    999 989   999 952   999 893   999 817   999 730   999 635   999 533
  3.67    999 990   999 955   999 898   999 826   999 742   999 651   999 553
  3.68    999 990   999 957   999 903   999 834   999 754   999 666   999 573
  3.69    999 991   999 959   999 908   999 841   999 765   999 680   999 591

  3.70    999 991   999 961   999 912   999 848   999 775   999 694   999 609
  3.71    999 992   999 963   999 916   999 855   999 785   999 708   999 625
  3.72    999 992   999 965   999 920   999 862   999 795   999 720   999 642
  3.73    999 993   999 967   999 924   999 868   999 804   999 733   999 657
  3.74    999 993   999 968   999 928   999 874   999 813   999 745   999 672

  3.75    999 994   999 970   999 931   999 880   999 821   999 756   999 686
  3.76    999 994   999 972   999 934   999 886   999 829   999 767   999 700
  3.77    999 994   999 973   999 938   999 891   999 837   999 777   999 713
  3.78    999 995   999 974   999 941   999 896   999 844   999 787   999 726
  3.79    999 995   999 976   999 943   999 901   999 852   999 796   999 738

  3.80    999 995   999 977   999 946   999 906   999 858   999 806   999 749
  3.81    999 995   999 978   999 949   999 910   999 865   999 814   999 760
  3.82    999 996   999 979   999 951   999 914   999 871   999 823   999 771
  3.83    999 996   999 980   999 954   999 918   999 877   999 831   999 781
  3.84    999 996   999 981   999 956   999 922   999 883   999 838   999 791

  3.85    999 996   999 982   999 958   999 926   999 888   999 846   999 800
  3.86    999 997   999 983   999 960   999 929   999 893   999 852   999 809
  3.87    999 997   999 984   999 962   999 933   999 898   999 859   999 817
  3.88    999 997   999 985   999 964   999 936   999 903   999 866   999 826
  3.89    999 997   999 986   999 966   999 939   999 907   999 872   999 834

  3.90    999 997   999 987   999 968   999 942   999 912   999 878   999 841
  3.91    999 997   999 987   999 969   999 945   999 916   999 883   999 848
  3.92    999 998   999 988   999 971   999 947   999 920   999 889   999 855
  3.93    999 998   999 989   999 972   999 950   999 924   999 894   999 862
  3.94    999 998   999 989   999 974   999 952   999 927   999 899   999 868

  3.95    999 998   999 990   999 975   999 955   999 931   999 903   999 874
  3.96    999 998   999 990   999 976   999 957   999 934   999 908   999 880
  3.97    999 998   999 991   999 977   999 959   999 937   999 912   999 885
  3.98    999 998   999 991   999 979   999 961   999 940   999 916   999 890

3*99 

999 998 

999 992 

999 980 

999 963 

999 943 

999 920 

999 895 

4*00 

999 999 

999 992 

999 981 

999 965 

999 946 

999 924 

999 900 





Table 1 (cont.). Probability integral of the extreme deviate (x_n − x̄)/σ or (x̄ − x_1)/σ
in ‘normal’ samples of size n



999 999 
999 999 
999 999 
999 999 
999 999 


999 992 
999 993 
999 993 
999 993 
999 994 


999 981 
999 982 
999 983 
999 983 
999 984 


999 965 
999 966 
999 968 
999 970 
999 971 


999 946 
999 948 
999 951 
999 953 
999 955 


999 924 
999 927 
999 931 
999 934 
999 937 


999 900 
999 905 
999 909 
999 913 
999 917 


999 999 
999 999 
999 999 
999 999 
999 999 


999 994 
999 994 
999 995 
999 995 
999 995 


999 985 
999 986 
999 987 
999 987 
999 988 


999 973 
999 974 
999 975 
999 976 
999 978 


999 958 
999 960 
999 962 
999 964 
999 965 


999 940 
999 943 
999 946 
999 948 
999 951 


999 921 
999 925 
999 928 
999 932 
999 935 


999 999 
999 999 
999 999 
999 999 
999 999 


999 996 
999 996 
999 996 
999 996 
999 996 


999 989 
999 989 
999 990 
999 990 
999 991 


999 979 
999 980 
999 981 
999 982 
999 983 


999 967 
999 969 
999 970 
999 972 
999 973 


999 953 
999 955 
999 957 
999 959 
999 961 


999 938 
999 941 
999 944 
999 946 
999 949 


1 000 000 


999 997 
999 997 
999 997 
999 997 
999 997 


999 991 
999 992 
999 992 
999 993 
999 993 


999 984 
999 984 
999 985 
999 986 
999 987 


999 975 
999 976 
999 977 
999 978 
999 979 


999 963 
999 965 
999 967 
999 968 
999 970 


999 951 
999 954 
999 956 
999 958 
999 960 


999 997 
999 998 
999 998 
999 998 
999 998 


999 993 
999 994 
999 994 
999 994 
999 995 


999 987 
999 988 
999 989 
999 989 
999 990 


999 980 
999 981 
999 982 
999 983 
999 984 


999 971 
999 973 
999 974 
999 975 
999 977 


999 962 
999 964 
999 965 
999 967 
999 969 


999 998 
999 998 
999 998 
999 998 
999 998 


999 995 
999 995 
999 996 
999 996 
999 996 


999 990 
999 991 
999 991 
999 992 
999 992 


999 985 
999 986 
999 986 
999 987 
999 988 


999 978 
999 979 
999 980 
999 981 
999 982 


999 970 
999 971 
999 973 
999 974 
999 975 


999 999 
999 999 
999 999 
999 999 
999 999 


999 996 
999 996 
999 997 
999 997 
999 997 


999 993 
999 993 
999 993 
999 994 
999 994 


999 988 
999 989 
999 990 
999 990 
999 991 


999 983 
999 983 
999 984 
999 985 
999 986 


999 977 
999 978 
999 979 
999 980 
999 981 


999 999 
999 999 
999 999 
999 999 
999 999 


999 997 
999 997 
999 997 
999 998 
999 998 


999 994 
999 995 
999 995 
999 995 
999 995 


999 991 
999 992 
999 992 
999 992 
999 993 


999 987 
999 987 
999 988 
999 988 
999 989 


999 982 
999 983 
999 983 
999 984 
999 985 


999 999 
999 999 
999 999 
999 999 
999 999 


999 998 
999 998 
999 998 
999 998 
999 998 


999 996 
999 996 
999 996 
999 996 
999 996 


999 993 
999 994 
999 994 
999 994 
999 995 


999 990 
999 990 
999 991 
999 991 
999 992 


999 986 
999 986 
999 987 
999 988 
999 988 


999 999 
999 999 
999 999 
999 999 
999 999 


999 998 
999 998 
999 999
999 999 
999 999 


999 997 
999 997 
999 997 
999 997 
999 997 


999 995 
999 995 
999 996 
999 996 
999 996 


999 992 
999 992 
999 993 
999 993 
999 993 


999 989 
999 989 
999 990 
999 991 
999 991 


999 999 


999 999 


999 998 


999 996 


999 994 


999 992 




Table 1 (cont.). Probability integral of the extreme deviate (x_n − x̄)/σ or (x̄ − x_1)/σ
in ‘normal’ samples of size n













Table 5. Auxiliary quantities required for calculating the probability integral of the studentized extreme deviate (x_n − x̄)/s_ν or (x̄ − x_1)/s_ν

[The entries of Table 5 and its continuation are illegible in the scan and are omitted.]


Table 6. A. Lower per cent points of the studentized extreme deviate (x_n − x̄)/s_ν or (x̄ − x_1)/s_ν

                          5%                                          1%
  ν \ n    3     4     5     6     7     8     9       3     4     5     6     7     8     9

   10    0.20  0.36  0.46  0.55  0.62  0.69  0.74    0.09  0.19  0.29  0.37  0.43  0.49  0.54
   15    0.20  0.35  0.46  0.55  0.63  0.70  0.75    0.09  0.19  0.29  0.37  0.44  0.50  0.56
   30    0.20  0.36  0.46  0.56  0.64  0.70  0.77    0.09  0.20  0.29  0.38  0.45  0.51  0.57
    ∞    0.20  0.35  0.47  0.56  0.65  0.72  0.78    0.09  0.20  0.30  0.38  0.46  0.53  0.59

B. Upper per cent points of the studentized extreme deviate (x_n − x̄)/s_ν or (x̄ − x_1)/s_ν

                          5%                                          1%
  ν \ n    3     4     5     6     7     8     9       3     4     5     6     7     8     9

   10    2.02  2.29  2.49  2.63  2.75  2.85  2.93    2.76  3.05  3.25  3.39  3.50  3.59  3.67
   11    1.99  2.20  2.44  2.58  2.70  2.79  2.87    2.71  3.00  3.19  3.33  3.44  3.53  3.61
   12    1.97  2.22  2.40  2.54  2.65  2.75  2.83    2.67  2.95  3.14  3.28  3.39  3.48  3.55
   13    1.95  2.20  2.38  2.51  2.62  2.71  2.79    2.63  2.91  3.10  3.24  3.34  3.43  3.51
   14    1.93  2.18  2.35  2.48  2.59  2.68  2.76    2.60  2.87  3.06  3.20  3.30  3.39  3.47
   15    1.92  2.16  2.33  2.40  2.56  2.65  2.73    2.57  2.84  3.02  3.16  3.27  3.35  3.43
   16    1.90  2.14  2.31  2.44  2.54  2.63  2.70    2.55  2.81  3.00  3.13  3.24  3.32  3.39
   17    1.89  2.13  2.30  2.42  2.52  2.61  2.68    2.52  2.79  2.97  3.10  3.21  3.29  3.36
   18    1.88  2.12  2.28  2.41  2.51  2.59  2.66    2.50  2.77  2.95  3.08  3.18  3.27  3.34
   19    1.87  2.11  2.27  2.39  2.49  2.58  2.65    2.49  2.75  2.92  3.06  3.16  3.24  3.31
   20    1.87  2.10  2.26  2.38  2.48  2.50  2.63    2.47  2.73  2.91  3.04  3.14  3.22  3.29
   24    1.84  2.07  2.23  2.35  2.44  2.52  2.59    2.43  2.68  2.85  2.97  3.07  3.15  3.22
   30    1.82  2.04  2.20  2.31  2.40  2.48  2.55    2.38  2.62  2.79  2.91  3.01  3.08  3.15
   40    1.80  2.02  2.17  2.28  2.37  2.44  2.51    2.34  2.57  2.73  2.85  2.94  3.02  3.08
   60    1.78  1.99  2.14  2.25  2.33  2.41  2.47    2.30  2.52  2.68  2.79  2.88  2.95  3.01
  120    1.70  1.97  2.11  2.21  2.30  2.37  2.43    2.25  2.48  2.62  2.73  2.82  2.89  2.95
    ∞    1.74  1.94  2.08  2.18  2.27  2.33  2.39    2.22  2.43  2.57  2.68  2.76  2.83  2.88






REFERENCES

Godwin, H. J. (1945). On the distribution of the estimate of mean deviation obtained from samples
from a normal population. Biometrika, 33, 254.

Hartley, H. O. (1944). Studentization, or the elimination of the standard deviation of the parent
population from the random sample distribution of statistics. Biometrika, 33, 173.

Hartley, H. O. (1945). Note on the calculation of the distribution of the estimate of mean deviation
in normal samples. Biometrika, 33, 257.

Irwin, J. O. (1925). On a criterion for the rejection of outlying observations. Biometrika, 17, 238.

Jones, A. E. (1946). A useful method for the routine estimation of dispersion from large samples.
Biometrika, 33, 274.

McKay, A. T. (1935). The distribution of the difference between the extreme observation and the
sample mean in samples of n from a normal universe. Biometrika, 27, 466.

McKay, A. T. & Pearson, E. S. (1933). A note on the distribution of range in samples of n. Biometrika,
25, 415.

Nair, K. R. (1947). A note on the mean deviation from the median. Biometrika, 34, 360.

Pearson, E. S. & Chandrasekar, C. (1936). The efficiency of statistical tools and a criterion for the
rejection of outlying observations. Biometrika, 28, 308.

Pearson, E. S. & Hartley, H. O. (1943). Tables of the probability integral of the studentized range.
Biometrika, 33, 89.

‘Student’ (1927). Errors of routine analysis. Biometrika, 19, 151.

Thompson, W. R. (1935). On a criterion for the rejection of observations and the distribution of the
ratio of deviation to sampling standard deviation. Ann. Math. Statist. 6, 214.

Tippett, L. H. C. (1925). On the extreme individuals and the range of samples taken from a normal
population. Biometrika, 17, 364.





THE FISHER-YATES TEST OF SIGNIFICANCE IN
2 × 2 CONTINGENCY TABLES

By D. J. FINNEY

Lecturer in the Design and Analysis of Scientific Experiment, University of Oxford

1. Introduction 

One of the tests of significance most frequently required in applications of statistics to
biological problems is that for the 2 × 2 contingency table. Briefly, the problem may be
typified by the following statement: Given two series of observations, with individual
results assumed independent of one another and classified as either ‘success’ or ‘failure’,
do the proportions of successes observed in the two series differ more widely than might
reasonably be expected if the population values of these proportions are equal? If the
expected numbers of successes and failures, calculated from the null hypothesis, are all
moderately large, a simple and well-known form of χ² test may be used; this test gives a good
approximation to the true probability of a deviation from equality as great as that observed,
especially if the adjustment usually termed ‘Yates’s correction’ is applied (Yates, 1934;
Fisher, 1946, §§ 21, 21.01).
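For orientation, the χ² test with Yates's correction can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the formula is the usual textbook form N(|ad − bc| − N/2)² divided by the product of the four marginal totals, and the figures used are the twin data quoted in § 2.

```python
import math

def yates_chi_square(a, b, c, d):
    """Chi-square with Yates's correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_product = (a + b) * (c + d)
    col_product = (a + c) * (b + d)
    # The correction reduces |ad - bc| by n/2 (never below zero).
    excess = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * excess ** 2 / (row_product * col_product)

def chi_square_tail_1df(x):
    """Upper-tail probability of chi-square on 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

# Twin data of section 2: rows dizygotic / monozygotic,
# columns not convicted / convicted.
chi2 = yates_chi_square(15, 2, 3, 10)
p_two_tail = chi_square_tail_1df(chi2)
print(round(chi2, 3), round(p_two_tail, 4))
```

With frequencies as small as these the approximation is exactly the kind the text warns about, so the result should be read only as a rough guide beside the exact test.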

Not infrequently, however, the expected numbers are too small for the χ² test to be trusted.
Fisher has recommended that, if any one of the expectations is less than 5, special steps
should be taken; Aitken (1944) and Cramér (1946) suggest a minimum of 10. Investigations
by Cochran (1942) indicate that no simple rule of this kind is completely adequate, and that
the magnitudes of all four expectations affect the quality of the approximation. Yates (1934)
and Fisher (1946, § 21.02) have given an exact method for the evaluation of the required
probability, for use when small frequencies make any χ² approximation unreliable. Their
method uses the four marginal totals as ancillary statistics: that is to say, the probability
is obtained as a relative frequency amongst all configurations having the same marginal
totals. More recent writers, notably Barnard (1947) and Pearson (1947), have questioned
whether this assumption of fixed marginal totals is logically justifiable for all 2 × 2
contingency tables; Barnard has discussed the specifications of several distinct problems, and
maintains that the Fisher-Yates test is applicable to only one of these. Nevertheless, the
test is undoubtedly the right one for at least one important class of problem, and for this
may be described as an exact method.†

The Fisher-Yates method requires rather tedious calculations, though the labour can be 
much reduced by use of a simple table of the logarithm of the factorial function, such as that 
given by Fisher & Yates (1948, Table XXX). The purpose of the present paper is to present 
a table of significance levels for their exact test which is applicable when all the frequencies 
are small. Yates (1934) and Fisher & Yates (1948, Table VIII) give a concise table which,
with the aid of a small amount of calculation, leads to a good approximation to the exact 
probability; the new table provides tests of significance only, but has the advantage of 
direct entry instead of preliminary calculation. Rules for the use of the table are given in 
§2, in sufficient detail for the non-mathematical reader; a brief account of the method of 
construction follows in § 3. 

† The probability associated with any of the classes of problem distinguished by Barnard is satisfactorily
approximated by that obtained from χ² when all expected frequencies are large.





2. The table of significance levels and its use
The table which follows may be used to test the significance of the deviation from
proportionality in any 2 × 2 contingency table having both frequencies in one of its margins less
than or equal to 15. The contingency table must first be put in the form

                Successes       Failures       Number of observations

  Series I          a             A − a                  A
  Series II         b             B − b                  B

  Total           a + b       A + B − a − b            A + B


where Series I is defined to be that which makes A ≥ B, and the type of observation
conventionally regarded as a ‘success’ is that which makes a/A ≥ b/B. For small integers, the
right arrangement can usually be made on sight, especially if the second condition is used
in the form aB ≥ bA.
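The rearrangement into standard form can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `standard_form` and its argument order are hypothetical, and the two series are supplied as (successes, failures) pairs.

```python
def standard_form(s1, f1, s2, f2):
    """Return (A, B, a, b) with A >= B and aB >= bA.

    s1, f1 are the successes and failures of one series; s2, f2 those of
    the other. Series and categories are relabelled as needed.
    """
    # Series I is the one with the larger number of observations.
    if s1 + f1 < s2 + f2:
        s1, f1, s2, f2 = s2, f2, s1, f1
    A, B = s1 + f1, s2 + f2
    a, b = s1, s2
    # 'Success' is the category with the higher proportion in Series I;
    # the comparison a/A >= b/B is made in the integer form aB >= bA.
    if a * B < b * A:
        a, b = A - a, B - b
    return A, B, a, b

# Twin data of the worked example: 2 convicted and 15 not convicted
# dizygotic twins, 10 convicted and 3 not convicted monozygotic twins.
print(standard_form(2, 15, 10, 3))
```

The integer comparison avoids any rounding in the ratios, which is the convenience the text points to.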

Providing that A ≤ 15, the table at the end of this section may then be entered in the section
for A, the sub-section for B, and the line for a. If b is equal to or less than the integer in
the column headed 0.05 or 0.01, a/A is significantly greater than b/B (single-tail test) at the
probability level 0.05 or 0.01 respectively. If b is equal to or less than the integer in the
column headed 0.025 or 0.005, a/A is significantly different from b/B (two-tail test) at the
probability level 0.05 or 0.01 respectively. A dash, or absence of any entry, for some
combination of A, B, and a indicates that no contingency table in that class is significant.
The probability corresponding to b will generally be less than that shown at the head of
the column, and the true numerical values are shown in small type. Those in the 0.025 and
0.005 columns should be doubled if used in a two-tail test.

From the mathematical point of view, the two pairs of marginal totals play equivalent 
roles in the test. Consequently, if only one set satisfies the condition that neither total exceeds 
15, that margin may be conventionally regarded as relating to the two series, and the other 
classification must then be taken as ‘success’ and ‘failure’ (see example below). If both 
margins satisfy the condition, the two possible arrangements will lead to identical 
conclusions. 

The use of the table presented here may be extended by noting that any contingency
table more extreme than one known to show significant deviation from proportionality
must itself be significant. Thus, even though one of each pair of marginal totals exceeds
15, a test of significance without calculation may still be possible. In particular, if the
standard arrangement of the contingency table gives


        a             A − a            A
        b             B − b            B

      a + b       A + B − a − b      A + B





with A > 15 ≥ B and aB ≥ bA, the deviation from proportionality will be significant if

    15 + a − A          A − a            15
        b               B − b             B

  15 + a + b − A    A + B − a − b      15 + B

is significant (but not necessarily non-significant otherwise), and the deviation will be
non-significant if

        a              15 − a            15
        b               B − b             B

      a + b        15 + B − a − b      15 + B

is not significant (but not necessarily significant otherwise).
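The two auxiliary tables can be written down mechanically. The sketch below is illustrative only: the function name `bounding_tables` is hypothetical, each table is represented by the tuple (A, B, a, b), and `None` is returned where the corresponding auxiliary table cannot be formed.

```python
def bounding_tables(A, B, a, b, limit=15):
    """Auxiliary tables for a standard-form table with A > limit >= B.

    Returns (sufficient, necessary):
      sufficient - Series I cut to `limit` by dropping successes; if this
                   table is significant, so is the original.
      necessary  - Series I cut to `limit` by dropping failures; if this
                   table is not significant, neither is the original.
    """
    if not (A > limit >= B and a * B >= b * A):
        raise ValueError("expected standard form with A > limit >= B")
    surplus = A - limit
    # Successes become limit + a - A, failures stay at A - a.
    sufficient = (limit, B, a - surplus, b) if a >= surplus else None
    # Successes stay at a, failures become limit - a.
    necessary = (limit, B, a, b) if a <= limit else None
    return sufficient, necessary

# Twin data in standard form: A = 17, B = 13, a = 15, b = 3.
print(bounding_tables(17, 13, 15, 3))
```

Either auxiliary table can then be looked up in the section for A = 15 without further calculation.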

As an example of the use of the table, Lange’s data on criminality among twin brothers or
sisters of criminals (Fisher, 1946, § 21.01) may be examined. The contingency table below
shows the numbers of twin brothers or sisters of criminals who had also been convicted,
separately for monozygotic and dizygotic (but like-sexed) twins.

                  Not convicted     Convicted     Total

  Dizygotic            15               2           17
  Monozygotic           3              10           13

  Total                18              12           30

Since 15/17 > 3/13 the category ‘not convicted’ is regarded as success. Consider the
contingency table

       13       2      15
        3      10      13

       16      12      28


in which A = 15, B = 13, and a = 13. The null hypothesis is that, in the general population,
freedom from conviction is equally frequent amongst dizygotic and monozygotic sibs. If
the only deviation from the null hypothesis which the investigator is prepared to consider
is that monozygotic twins behave more similarly than dizygotic, he will require a one-tail
significance test. The table below shows 4 as the 0.01 significance level of b (with a true
probability of 0.004, not 0.010). Hence b = 3 is significant: by the rule given, the deviation
from proportionality in the original table is significant evidence that criminality is more
frequent among monozygotic twins of criminals than among dizygotic twins of criminals.
Fisher gives the exact probability for the original data as 1/2150. On the other hand, if the
investigator were concerned only to demonstrate a significant difference between the
frequencies of freedom from conviction for the two types of twin, irrespective of whether
monozygotic or dizygotic should show the higher value, he would use a two-tail test; for




A = 15, B = 13, a = 13, the value of b in the 0.005 column is again 4 (though generally it
will be less than for the one-tail test), and again the evidence against the null hypothesis is
judged significant, both for the modified and for the original contingency table.

As an illustration of the interchangeability of marginal totals, the same data may be
examined by arranging the contingency table as:

                   Dizygotic     Monozygotic     Total

  Not convicted       15              3            18
  Convicted            2             10            12

  Total               17             13            30


in which ‘dizygotic’ is classed as success. Logically the meaning may be different, but the
test of significance is the same. The modified contingency table

       12       3      15
        2      10      12

       14      13      27


has A = 15, B = 12, a = 12; b = 2 is judged significant by comparison with the tabular
values 3 at 0.01 or 2 at 0.005, and therefore again significance is attained on either the one-tail
or the two-tail test.
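The interchangeability of the margins can be checked on the original twin data by exact arithmetic. The sketch below is an illustration, not the author's procedure: the function name `exact_tail` is hypothetical, and it sums the probabilities of the observed table and of all more extreme tables with the same marginal totals.

```python
from fractions import Fraction
from math import comb

def exact_tail(A, B, a, b):
    """One-tail exact probability for a standard-form table (A, B, a, b)."""
    n = a + b                      # fixed total number of successes
    k = max(0, n - A)              # smallest value that b can take
    favourable = sum(comb(B, j) * comb(A, n - j) for j in range(k, b + 1))
    return Fraction(favourable, comb(A + B, n))

# Series = type of twin, success = not convicted.
by_twin_type = exact_tail(17, 13, 15, 3)
# Series = conviction status, success = dizygotic.
by_conviction = exact_tail(18, 12, 15, 2)

print(by_twin_type, by_conviction, float(by_twin_type))
```

Both arrangements give the same fraction, close to the 1/2150 quoted from Fisher.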

The table follows on pp. 149-54. 

3. Construction of the table 

Any 2 × 2 contingency table to which a test of significance is to be applied must first be
written in the standard form shown at the beginning of § 2, in which the event regarded as
a success and the arrangement of the rows are so chosen that A ≥ B and a/A ≥ b/B. If
a/A = b/B, there is no need to proceed further, as the data then accord perfectly with the
null hypothesis; if a/A > b/B, of necessity a > b. On the null hypothesis that the population
values for the proportions of successes are equal, and for the specified set of marginal totals,
the probability of this set of frequencies is (Fisher, 1946, § 21.02):

\[
P_b = P(b \mid A, B, a+b)
    = \frac{A!\,B!\,(a+b)!\,(A+B-a-b)!}{(A+B)!} \times \frac{1}{a!\,b!\,(A-a)!\,(B-b)!}.
\]

The first factor in this expression is dependent only on the marginal totals, the second
depends upon the internal cell frequencies. The probability that a deviation from equality
of the two proportions as great as or greater than that observed should occur by chance is
then $P_b^*$, where

\[
P_b^* = P^*(b \mid A, B, a+b) = P_b + P_{b-1} + P_{b-2} + \dots + P_k,
\]

and k is the greater of the two quantities 0 and (a + b − A). In the summation the four
marginal totals are kept fixed. Barnard (1947) has discussed the logic of the test in detail.
He distinguishes certain classes of problem for which, in his view, the method described here
is inappropriate; this topic has been a source of some controversy, to which the present
paper is not intended as a contribution.
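The two formulae of this section translate directly into code. The sketch below is a hedged illustration: the function names are hypothetical, and `math.lgamma` stands in for the table of logarithms of factorials mentioned in the Introduction.

```python
import math

def log_factorial(n):
    """Natural logarithm of n!, playing the part of a log-factorial table."""
    return math.lgamma(n + 1)

def point_probability(A, B, a, b):
    """P_b: probability of the table (a, A-a; b, B-b) with fixed margins."""
    margins = (log_factorial(A) + log_factorial(B) + log_factorial(a + b)
               + log_factorial(A + B - a - b) - log_factorial(A + B))
    cells = (log_factorial(a) + log_factorial(b)
             + log_factorial(A - a) + log_factorial(B - b))
    return math.exp(margins - cells)

def cumulative_probability(A, B, a, b):
    """P*_b = P_b + P_(b-1) + ... + P_k with k = max(0, a + b - A)."""
    n = a + b
    k = max(0, n - A)
    # Lowering b by one raises a by one, so a + b stays fixed.
    return sum(point_probability(A, B, n - j, j) for j in range(k, b + 1))

# Original twin data in standard form.
print(cumulative_probability(17, 13, 15, 3))
```

For the twin data the sum has three terms (b = 3, 2, 1) and comes to about 0.000465, in agreement with the value 1/2150 quoted in § 2.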




Table of significance levels of b 

(Values of b in bold type; corresponding probabilities, Pi in small type) 


Probability 


Probability 


0*05 0*025 001 


0*05 0 025 0*01 












Table of significance levels of b ( continued ) 



005 0025 001 


1 •005" 
1 023 

*010+ I 0 <010+ 


1*005- A= 10 B=4 
0 003 


1 *011 
1 *041 
0 * 015 - 

0 * 035 - 

1 *038 
0 *014 
0 035 “ 

0 015 + 
0 045 + 


A= 1 1 B== 1 1 


3 003 
2 005 - 
1 *004 
0 002 
0 *008 
>022 I — 


7 

•045 

6 

•018 

5 

•032 

4 

•012 

4 

040 

3 

• 015 - 

3 

•043 

2 

• 015 ** 

2 

•040 

1 

•012 

1 

•032 

0 

•006 

0 

•018 

0 

•018 

0 

045 + 

— 


6 

• 035 + 

5 

•012 

4 

•021 

4 

•021 

3 

•024 

3 

•024 

2 

023 

2 

•023 

1 

•017 

1 

•017 

1 

•043 

0 

•009 

0 

•023 

0 

•023 

5 

•026 

4 

•008 

4 

•038 

3 

•012 

3 

•040 

2 

•012 

2 

035 - 

1 

•009 

1 

•025 ~ 

1 

* 025 “ 

0 

•012 

0 

•012 

0 

•030 

— 


4 

•018 

4 

•018 

3 

•024 

3 

•024 

2 

■022 

2 

•022 

1 

•015 

1 

• 015 - 

1 

•037 

0 

■007 

0 

•017 

0 

•017 

0 

•040 

— 


4 

• 0^3 

3 

•Oil 

3 

•047 

2 

013 

2 

•039 

1 

•009 

1 

■ 025 - 

1 

• 025 - 

0 

010 + 

0 

• 010 + 

0 

•025 

0 

■ 025 * 

3 

029 

2 

006 

2 

028 

1 

005 + 

1 

018 

1 

018 


3 * 005 - 
2 006 
1 * 005 - 
0 002 
0 007 


2 *002 2 002 

1 002 1 002 

1 *009 0 001 

0 *004 0 004 


2 006 1 *001 

1 *005+ 0 *001 

0 *002 0 *002 









Table of significance levels of b (continued) 



Probability 


005 0025 0-01 0*005 




3 050 - 

2 045 “ 

1 034 
0 019 

0 047 

7 037 

5 024 

4 029 

3 030 

2 026 

1 019 

1 045 - 

0 024 

6 '029 

5 043 

4 048 

3 *046 

2 038 

1 026 
0 *012 

0 *030 

5 *021 

4 *029 

3 *029 
2 *024 

1 016 


0 007 


1 003 1 003 

0 *001 0 001 

0 * 005 “ 0 005 - 


1-009 0 001 

0 004 0 -004 


0 -003 0 003 


A= 12 B = 9 I 


A= 13 B= 1 3 13 

12 
11 
10 
9 
8 


1 037 
0 *017 

0 039 
5 *049 
3 *018 

2 *015+ 

2 *040 

1 '025- 
0 - 010 + 

0 *024 

4 *036 

3 *038 

2 029 

1 017 

1 040 
0 016 

0 034 

3 *025 “ 

2 022 

1 013 

1 032 
0 011 
0 *025- 

0 050- 

2 015“ 

1 -oio- 
1 *028 
0 *009 
0 *020 
0 *041 


4 *014 

3 *018 

2 * 015 + 
1 oio- 

1 * 025 - 
0 * 010 + 
0 *024 


3 * 025 - 
2 022 
1 013 
0 * 005 “ 
0 011 
0 * 025 - 


2 * 005 - 
1 *004 
0 *002 
0 * 005 - 


1 007 1 *007 

0 003 0 *003 

0 *008 0 *008 

0 019 — 

0 *002 0 *002 

0 *009 0 009 

0 *022 — 

0 011 — 







Table of significance levels of b (continued) 


Probability 


12 13 

12 
11 
10 
9 
8 
7 
6 
5 

11 13 

12 



0*05 

0*025 

2 *048 

1 *015+ 

1 *037 

0 *007 

0 *020 

0 *020 

0 *048 

— 

8 *039 

7 *015- 

6 *027 

5 oio- 

5 *033 

4 *013 

4 *036 

3 *013 

3 *034 

2 *011 

2 *029 

1 008 

1 *020 

1 *020 

1 046 

0 oio- 

0 024 

0 *024 

7 031 

6 oil 

6 *048 

5 *018 

4 *021 

4 021 

3 *021 

3 -021 

3 *050“ 

2 *017 

2 *040 

1 011 

1 *027 

0 *005- 

0 013 

0 *013 

0 *030 

— 

6 *024 

6 *024 

5 *035- 

4 *012 

4 *037 

3 *012 

3 *033 

2 *010+ 

2 *026 

1 006 

1 *017 

1 -017 

1 -038 

0 007 

0 *017 

0 -017 

0 *038 

— 

5 *017 

5 *017 

4 *023 

4 *023 

3 *022 

3 *022 

2 *017 

2 *017 

2 *040 

1 *010+ 

1 025- 

1 *025- 

0 *010+ 

0 *010+ 

0 *023 

0 ’023 

0 *049 

— 

5 *042 

4 *012 

4 *047 

3 014 

3 *041 

2 oil 

2 *029 

1 *007 

1 *017 

1 017 

1 *037 

0 006 

0 015- 

0 *015- 

0 *032 

— 

4 *031 

3 -007 

3 *031 

2 *007 





Probability 


0025 0 01 0*005 



2 004 2 -004 

1 003 1 003 

1 oio- 0 ooi 

0 003 0 *003 

0 008 — 


1 002 1 002 

1 *008 0 001 

0 002 0 002 

0 007 — 








































155 


D. J. Finney 

The table given in § 2 enables tests of significance, at probability levels of 0.05, 0.025, 0.01
and 0.005, to be made by direct reference for any 2 x 2 contingency table having $B \le A \le 15$.
The section of the table for any particular pair of values $A$, $B$ was constructed by giving
to $n = a + b$ successively all integral values from $(A + B)$ down to zero; for each $n$, $b$ was
given in turn the values $k$, $k + 1$, $k + 2$, ... (where $k$ is the greater of the two quantities 0 and
$n - A$), $P_b$ was formed and the cumulative sum $P^*$ was recorded. The calculations with any
set of $A$, $B$, $n$ were stopped as soon as $P^*$ exceeded 0.05 or would obviously exceed 0.05
for the next higher value of $b$. The calculations were tedious, but by no means as lengthy
as might appear from this account, for after a little experience extreme values of $n$, which
would always give $P^* > 0.05$, could be ignored, and trends in $P^*$ often enabled values of $b$
which would exceed the limit to be foreseen without calculation of $P_b$. Of course, in any case
of doubt, $P^*$ was evaluated.

The probabilities were determined to five places of decimals, but any full publication of
these would occupy a great deal of space. For many practical purposes, all that is required
is a test of significance at an arbitrary level of probability. This can be conveniently made by
use of a table showing, for each combination of $A$, $B$, and $a$, the greatest value of $b$ for which
$P^*$ lies beyond the chosen significance level. The values of $b$ required can be seen immediately
from systematic inspection of the calculations just described. For example, the calculations
show the following 2 x 2 configurations with their associated values of $P^*$.


   8    3 | 11        8    3 | 11        8    3 | 11        8    3 | 11
   3    7 | 10        2    8 | 10        1    9 | 10        0   10 | 10
  11   10 | 21       10   11 | 21        9   12 | 21        8   13 | 21

  P* = 0.0635        P* = 0.0226        P* = 0.0058        P* = 0.0008


Consequently $P^*(b \mid 11, 10, 11) < 0.05$ when $b \le 2$,

$P^*(b \mid 11, 10, 10) < 0.025$ when $b \le 2$,

$P^*(b \mid 11, 10, 9) < 0.01$ when $b \le 1$,
and $P^*(b \mid 11, 10, 8) < 0.005$ when $b = 0$.

These values of $b$, 2, 2, 1, and 0, are therefore tabulated in § 2 under the appropriate
probabilities. The $P^*$ corresponding to each tabulated $b$, which is usually less than the significance
level, is shown in small type; for any smaller $b$, $P^*$ will be even lower. From the existing
calculations, of course, significance levels for any other selected probability less than 0.05
could easily be read, but 0.05, 0.025, 0.01 and 0.005 seem sufficient for the present.
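The construction described above can be checked numerically. The following is a minimal sketch, not the original computing procedure: it assumes the usual hypergeometric term for a 2 x 2 table with fixed margins, and the function name `tail_probability` is purely illustrative. It accumulates the terms from the smallest admissible $b$ (the greater of 0 and $n - A$) up to the observed $b$, for the example $A = 11$, $B = 10$, $a = 8$.

```python
from math import comb


def tail_probability(A, B, a, b):
    """Cumulative probability P* of b or fewer in the B row, margins fixed.

    Assumes rows (a, A - a) and (b, B - b), with column total n = a + b.
    """
    n = a + b
    k = max(0, n - A)  # smallest b compatible with the margins
    total = comb(A + B, n)
    return sum(comb(A, n - j) * comb(B, j) for j in range(k, b + 1)) / total


# Worked example from the text: A = 11, B = 10, a = 8, b = 3, 2, 1, 0.
for b in (3, 2, 1, 0):
    print(b, round(tail_probability(11, 10, 8, b), 4))
```

Only $b = 3$ gives a value above 0.05, which is why 2 is the largest significant $b$ at the 0.05 level for this row.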

The standard tables of the $\chi^2$ distribution (Fisher & Yates, 1948, Table IV) provide
a two-tail test when applied to a 2 x 2 contingency table. That is to say, on the null hypothesis
that the proportions of successes in the two populations compared are equal, and assuming
that frequencies are sufficiently large for the sampling distributions of the proportions to be
taken as normal, the test is based on the probability of obtaining a difference as great as or
greater than $(a/A - b/B)$ in either direction under conditions of random sampling. If a
single-tail test is wanted, it can be obtained simply by entering the $\chi^2$ table in the column
corresponding to twice the level of significance. For small frequencies, however, the discrete
nature of the distribution for configurations with fixed marginal frequencies cannot be
ignored, and unless either $A = B$ or $2n = (A + B)$ this distribution is not symmetrical. The
exact test of significance described at the beginning of this section is a single-tail test, being



156 The Fisher- Yates test of significance in 2x2 contingency tables 

based on the probability of a deviation from proportionality as great as or greater than that 
observed and in the direction of the observed deviation. In general, no deviation in the 
opposite direction will have exactly the same probability. For example, for the configuration 


  10    0 | 10
   4    5 |  9
  14    5 | 19

the probability is $P^* = P_b = 0.0108$.

No deviation in the opposite direction can be regarded as the equivalent of this either in 
the sense of having the same difference in observed proportions or in the sense of having the 
same probability; the extreme configuration at the other tail is 


   5    5 | 10
   9    0 |  9
  14    5 | 19

for which $P^* = P_b = 0.0217$.

If a two-tail test is wanted when the frequencies are small, a new convention must first be
introduced. The only satisfactory procedure, which is consistent with the practice for
larger frequencies (when the $\chi^2$ distribution is used), is to regard the two-tail probability as
given by $2P^*$. In other words, the two-tail significance tests corresponding to single-tail
tests at the conventional probability levels 0.05 and 0.01 are given by comparing $P^*$ with
0.025 and 0.005 respectively. It is this consideration which governed the choice of
probability levels for tabulation in § 2.
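The asymmetry of the two tails, and the doubling convention, can be illustrated with the configuration quoted above (row totals 10 and 9, column totals 14 and 5). This is a sketch only; `point_probability` is an illustrative name. In both tables the observed configuration is the most extreme one possible on its side, so $P^*$ reduces to a single hypergeometric term.

```python
from math import comb


def point_probability(A, B, a, b):
    """Hypergeometric probability of the table (a, A - a; b, B - b), margins fixed."""
    return comb(A, a) * comb(B, b) / comb(A + B, a + b)


p_observed = point_probability(10, 9, 10, 4)  # table 10 0 / 4 5
p_opposite = point_probability(10, 9, 5, 9)   # table 5 5 / 9 0, the other tail
two_tail = 2 * p_observed                     # convention: two-tail probability = 2P*

print(round(p_observed, 4), round(p_opposite, 4), round(two_tail, 4))
```

With the doubling convention the observed table would be judged significant at the two-tail 0.05 level, since $P^*$ is below 0.025, but not at the two-tail 0.01 level.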

4. Summary 

This paper contains a table from which a test of significance of the deviation from
proportionality in a 2 x 2 contingency table, based upon the Fisher-Yates exact probability
method, can be read directly. Any 2 x 2 contingency table for which neither member of one
pair of marginal totals exceeds 15 can be tested in this way, and a simple rule extends the
applicability of the table to certain contingency tables with larger frequencies. The method
of construction of the table is described.

I am indebted to a number of past and present members of my staff for the extensive and
tedious calculations on which the table is based, especial thanks being due to Miss M. Callow.

REFERENCES 

Aitken, A. C. (1944). Statistical Mathematics, 3rd ed. Edinburgh: Oliver and Boyd.

Barnard, G. A. (1947). Significance tests for 2 x 2 tables. Biometrika, 34, 123.

Cochran, W. G. (1942). The χ² correction for continuity. Iowa St. Coll. J. Sci. 16, 421.

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.

Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.
Fisher, R. A. & Yates, F. (1948). Statistical Tables for Biological, Agricultural and Medical Research,
3rd ed. Edinburgh: Oliver and Boyd.

Pearson, E. S. (1947). The choice of statistical tests illustrated on the interpretation of data classed
in a 2 x 2 table. Biometrika, 34, 139.

Yates, F. (1934). Contingency tables involving small numbers and the χ² test. J. Roy. Statist. Soc.
Suppl. 1, 217.



[ 157 ] 


THE POWER FUNCTION OF THE TEST FOR THE 
DIFFERENCE BETWEEN TWO PROPORTIONS IN 

A 2x2 TABLE 

By P. B. PATNAIK, University College , London 
1. Introduction 

Neyman & Pearson's (1933) conception of the power of a test of a statistical hypothesis, $H_0$,
was developed, in the first instance, as a means of guiding the choice between alternative
tests. This, it was shown, could be done by comparing the effectiveness of the tests in
discriminating between $H_0$ and a set of admissible alternative hypotheses regarded as most
relevant to the question under test. Where there is no doubt about the most appropriate
test and no sequential scheme of sampling is possible, the power function may play a useful
part in indicating, before the data are collected, how large the samples should be to avoid
an inconclusive result. If this procedure is to be easily applied, a ready means must be
available of calculating the power of the test for a given significance level and sample size.
The tables of the power function of the t-test (Neyman & Tokarska, 1936) and Tang's
Tables (1938), applicable in the Analysis of Variance, are examples of such aids. The present
paper aims at providing in simple, if approximate, form a means of determining the power
function of the test for the difference between two proportions.

The test may be briefly outlined as follows. In two ‘infinite’ populations the proportions
of individuals possessing a character A are $p_1(A)$ and $p_2(A)$ respectively. Random samples
of $m$ and $n$ are drawn from the two populations and the result is represented in a 2 x 2 table,
thus:


Table 1. Number of individuals

                                   With A    Without A    Total
  Sample from first population       a           c          m
  Sample from second population      b           d          n
  Total                              r           s          N


We may then wish to apply the test in two forms:

(i) The ‘two-sided’ form. To test $H_0$ that $p_1 = p_2$, bearing in mind the two-sided
alternatives that $p_1 < p_2$ and $p_1 > p_2$.

(ii) The ‘one-sided’ form. To test $H_0$ that $p_1 \le p_2$, bearing in mind the one-sided
alternatives $p_1 > p_2$. These will be referred to in the following sections as cases (i) and (ii).

In obtaining a solution we shall regard the sample space as two-dimensioned as in
Pearson's (1947) Problem II or Barnard's (1947) ‘2 x 2 comparative trial’. Fig. 1 (a) roughly
illustrates the sample space for the case $m = n = 50$. A possible sampling result is represented
by a point $(a, b)$ in this lattice of 51 x 51 = 2601 points. Sample points having $a + b = r$ are



158 


The power function in a 2x2 table 

said to lie on the ‘diagonal’ $r$ of the lattice. In the case of small samples, or even for large
samples when $(a, b)$ falls near the $(0, 0)$ or $(m, n)$ corners of the lattice, considerable difficulties
arise owing to the discontinuity of the distribution. We shall be concerned in the first place
with examining the case where it is justifiable to apply the test by referring to the normal
probability scale the ratio

$$u = \frac{a - \dfrac{rm}{N}}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}} = \frac{\dfrac{a}{m} - \dfrac{b}{n}}{\sqrt{\dfrac{r}{N}\,\dfrac{s}{N}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}} = \frac{(ad - bc)\sqrt{N}}{\sqrt{mnrs}} \quad \text{(approx.)}, \tag{1}$$

where the second and third alternative forms are equivalent to the first if the factor $N - 1$ is
replaced by $N$. The first form brings out the fact that $u$ is the ratio of the deviation from the
mean to the standard deviation, in the hypergeometric series representing the conditional
distribution of $a$ for $r$ fixed. The second form arises from the classical approach in which
a difference of observed proportions is compared with an estimate of its standard error. The
third, when squared, gives the commonly quoted form of the expression for $\chi^2$, with one
degree of freedom, in a 2 x 2 table.
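The relation between the three forms of (1) can be verified on a small numerical case. The sketch below uses hypothetical counts (30, 20, 20, 30 in the layout of Table 1), chosen only to make the arithmetic easy to follow; `u_forms` is an illustrative name.

```python
from math import sqrt


def u_forms(a, b, c, d):
    """Return the three forms of the ratio u for the 2 x 2 table of Table 1."""
    m, n = a + c, b + d          # row totals (sample sizes)
    r, s = a + b, c + d          # column totals
    N = m + n
    first = (a - r * m / N) / sqrt(m * n * r * s / (N**2 * (N - 1)))
    second = (a / m - b / n) / sqrt((r / N) * (s / N) * (1 / m + 1 / n))
    third = (a * d - b * c) * sqrt(N) / sqrt(m * n * r * s)
    return first, second, third


# Hypothetical counts, not taken from the paper.
first, second, third = u_forms(a=30, b=20, c=20, d=30)
print(first, second, third)
```

The second and third forms coincide exactly, and the first differs from them only by the factor $\sqrt{(N-1)/N}$, which is the effect of replacing $N - 1$ by $N$.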




Write $u_\alpha$ for the $100\alpha$ percentage point of the normal distribution $N(0, 1)$, i.e.

$$\frac{1}{\sqrt{2\pi}} \int_{u_\alpha}^{\infty} e^{-\frac{1}{2}u^2}\,du = \alpha;$$

then using the two-sided test (i), we should reject the hypothesis that $p_1 = p_2$ at the
significance level $\alpha$ when $|u| > u_{\frac{1}{2}\alpha}$; in the case of the one-sided test, we should reject the
hypothesis that $p_1 \le p_2$ at the same significance level when $u > u_\alpha$. Thus, in the example illustrated
in Fig. 1 (a), it is seen from the position of the sample point that $H_0$ would not be rejected at
the 5 % level if the test is in form (i), but would be rejected at the 5 % level if it is in form (ii).





P. B. Patnaik 159 

The boundaries of the critical region associated with the test, taking, say, case (i), are
formed by obtaining from the equation

$$\frac{a - \dfrac{mr}{N}}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}} = \pm u_{\frac{1}{2}\alpha}$$

the ‘cut-off’ points $(a, r - a)$ on each diagonal $r$ in the lattice and joining them as in Fig. 1 (a).
Subject to the error involved in the approximation, the chance is $\alpha$ that the sample point
falls in the critical region when $p_1$ and $p_2$ have a common, though unknown, value, i.e. in
Neyman & Pearson's notation, $P\{E \in w_\alpha \mid p_1 = p_2\} = \alpha$. If in fact $p_1 \ne p_2$, then the power of
the test of $H_0$ with regard to the alternative hypothesis $H_1(p_1, p_2)$ is the chance that the point
$(a, b)$ falls in the critical region when sampling from populations with proportions $p_1$ and $p_2$.
Or, formally,

$$P\{E \in w_\alpha \mid p_1, p_2\} = P\{|u| > u_{\frac{1}{2}\alpha} \mid p_1, p_2\} \quad \text{for case (i)},$$
$$\text{or} \quad = P\{u > u_\alpha \mid p_1, p_2\} \quad \text{for case (ii)}.$$

This is the total probability density at all the discrete points $(a, b)$ included in the critical
region. The problem is to express this in a readily calculable form.
If this is done, two types of application are evident:

(1) When the decision has been made to take two samples of, say, 50, or when the
available data happen to consist in samples of this size, we may ask, ‘what is the chance that
the test described will show a difference in observed proportions significant at the 5 % level
when, in fact, $p_1$ and $p_2$ are as different, say, as 0.50 and 0.65?’

(2) On the other hand, we may use the theory to ask in advance how large the samples
should be so that the risk of failing to detect a given difference* between $p_1$ and $p_2$ which
is considered to be of importance, shall be acceptably small. For example, we may ask,
‘what sizes of samples should we take so that in applying our test we may have a high, say
a 90 %, chance of detecting that the proportions are not equal when, in fact, they are as
different as 0.50 and 0.65?’

It is clear that for given $m$, $n$ and $\alpha$ the power of the test will be constant on certain
contours in the $p_1, p_2$ space such as those shown in Fig. 1 (b). We shall examine the
approximate form of these contours and show that in the important case when $m = n = \frac{1}{2}N$, the
family of contours is independent of $N$ and $\alpha$, although the power associated with a particular
contour will be a function of $N$ and $\alpha$, which has been tabled. Throughout the investigation,
approximations are made of the type involved in representing binomial or hypergeometric
distributions by normal distributions. The adequacy of these approximations is examined
in certain cases.

2. Exact values for the power function of the test based on the ratio u
Since $a$ and $b$ are independent, it follows that for the population proportions $p_1$ and $p_2$

$$p(a, b) = \frac{m!}{a!\,c!}\,p_1^a q_1^c \cdot \frac{n!}{b!\,d!}\,p_2^b q_2^d \qquad (q_1 = 1 - p_1,\ q_2 = 1 - p_2). \tag{2}$$

If our test consists in rejecting $H_0(p_1 = p_2)$ when $|u| > u_{\frac{1}{2}\alpha}$, its power with respect to an
alternative $H_1(p_1 \ne p_2)$ is the sum of the values of $p(a, b)$ for all points $(a, b)$ lying in the critical
region. Exact calculation is laborious, since it involves the multiplication of terms of one
binomial by the sums of terms of the other, which may be obtained from the tables of the
* Neyman & Pearson have termed this the error of the second kind. 




Incomplete Beta Function. In the special case $m = 18$, $n = 12$, discussed in Pearson's paper,
exact values of the power function have been calculated for $p_1, p_2 = 0.1(0.1)0.9$ for the
two-sided test and for significance levels 10 and 2 %; they are tabulated in Tables 2(a) and


Table 2(a). Power function for case m = 18, n = 12. Significance level α = 0.10
Approximate values are shown in parentheses


























2(b).* When $p_1 = p_2$ the value of the power function should reduce to $\alpha$. But in these tables
we see that the values on the diagonal, $p_1 = p_2$, are not exactly equal to 0.10 or 0.02. The
discrepancy is due to the fact that continuous approximations have been made to the
discontinuous distributions in the formulation of the test.
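The exact calculation described in this section can be sketched directly by summing the binomial products of (2) over the lattice points in the two-sided critical region. This is an illustrative implementation, not a reproduction of the original computations: the function names are hypothetical, and the handling of the two degenerate diagonals ($r = 0$ and $r = N$), where $u$ is undefined, is an assumption (they are treated as non-significant).

```python
from math import comb, sqrt
from statistics import NormalDist


def u_ratio(a, b, m, n):
    """First form of the ratio u in equation (1)."""
    N = m + n
    r = a + b
    s = N - r
    if r == 0 or s == 0:
        # Assumption: with a zero margin the test cannot reject.
        return 0.0
    return (a - r * m / N) / sqrt(m * n * r * s / (N**2 * (N - 1)))


def exact_power(p1, p2, m, n, alpha):
    """Sum p(a, b) of equation (2) over all points with |u| beyond the cut-off."""
    cutoff = NormalDist().inv_cdf(1 - alpha / 2)
    power = 0.0
    for a in range(m + 1):
        prob_a = comb(m, a) * p1**a * (1 - p1) ** (m - a)
        for b in range(n + 1):
            if abs(u_ratio(a, b, m, n)) > cutoff:
                prob_b = comb(n, b) * p2**b * (1 - p2) ** (n - b)
                power += prob_a * prob_b
    return power


size = exact_power(0.5, 0.5, 18, 12, 0.10)   # true rejection rate under H0
power = exact_power(0.6, 0.3, 18, 12, 0.10)  # one alternative off the diagonal
print(f"size at p1 = p2 = 0.5: {size:.3f}")
print(f"power at (0.6, 0.3):   {power:.3f}")
```

Running the sketch at a point on the diagonal shows the effect noted above: the rejection rate is close to, but not exactly, the nominal level, because the cut-off is taken from a continuous scale while the sample space is a lattice.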

Interpolation between the calculated values in Table 2 (a) leads to the power contours
shown in Fig. 2. The test has the same power of establishing significance when sampling
from any populations for which $(p_1, p_2)$ lies on a given contour. The chances of establishing
significance are written alongside the contours. The surface for which the ordinates at $(p_1, p_2)$
are equal to the power may be called the power surface of the test.



Fig. 2. Power contours (special case m = 18, n = 12) for significance level 0.10.
(Horizontal axis: values of $p_1$, from 0 to 1.0.)

We can see from the contour diagram that with samples of this size (18 and 12), even when
the $p$'s are as different as 0.6 and 0.3 there is only a 50 % chance of establishing significance.
If instead of the high level $\alpha = 0.10$ employed here, a lower level were chosen, then the $p$'s
must differ even more to give the same chance of establishing significance. Thus the diagram
illustrates well how inadequate small samples are to establish what would ordinarily be
regarded as a difference of some importance.

It can also be seen that $|p_1 - p_2|$ is nearly constant on a power contour near the
middle of the square; e.g. the difference is roughly 0.3 on the 0.5 contour within the range
(0.45, 0.15) and (0.75, 0.45).

* For these calculations, I am indebted to the members of the Statistics Department of University 
College, London, in particular to Mr V. D. Gangolli. 


Biometrika 35


3. Approximations to the power function 
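We will now consider an approximation to the distribution of $a$ under the hypothesis $H_1$,
under which the population proportions are $p_1$ and $p_2$. From (2)

$$p(a, r) = \frac{N!}{r!\,s!}\,q_1^m q_2^n \left(\frac{p_2}{q_2}\right)^r \left\{\frac{m!\,n!\,r!\,s!}{N!\,a!\,b!\,c!\,d!}\right\} \left(\frac{p_1 q_2}{p_2 q_1}\right)^a.$$

If we replace the hypergeometric term in the curled brackets by the ordinate of a normal
curve having the mean and s.d. of the hypergeometric series, then

$$p(a, r) = \frac{N!}{r!\,s!}\,q_1^m q_2^n \left(\frac{p_2}{q_2}\right)^r \times \frac{1}{\sqrt{2\pi}\,\sqrt{\dfrac{mnrs}{N^2(N-1)}}} \exp\left[-\frac{\left(a - \dfrac{rm}{N}\right)^2}{\dfrac{2mnrs}{N^2(N-1)}}\right] \times \left(\frac{p_1 q_2}{p_2 q_1}\right)^a.$$

Writing $\left(\dfrac{p_1 q_2}{p_2 q_1}\right)^a$ as $\exp\left[a \log_e \dfrac{p_1 q_2}{p_2 q_1}\right]$, collecting the terms containing $a$ and making
a perfect square, we obtain $p(a, r) = p(r) \times p(a \mid r)$, where

$$p(r) = \frac{N!}{r!\,s!}\,q_1^m q_2^n \left(\frac{p_2}{q_2}\right)^r \exp\left[\frac{rm}{N}\log_e\frac{p_1 q_2}{p_2 q_1} + \frac{mnrs}{2N^2(N-1)}\left(\log_e\frac{p_1 q_2}{p_2 q_1}\right)^2\right] \tag{3}$$

and

$$p(a \mid r) = \frac{1}{\sqrt{2\pi}\,\sqrt{\dfrac{mnrs}{N^2(N-1)}}} \exp\left[-\frac{\left(a - \dfrac{rm}{N} - \dfrac{mnrs}{N^2(N-1)}\log_e\dfrac{p_1 q_2}{p_2 q_1}\right)^2}{\dfrac{2mnrs}{N^2(N-1)}}\right]. \tag{4}$$

Thus (4) is the approximate conditional distribution of $a$ on the diagonal $r = a + b$ and is
seen to be normal, with

$$\text{mean} = \frac{rm}{N} + \frac{mnrs}{N^2(N-1)}\log_e\frac{p_1 q_2}{p_2 q_1} \quad \text{and} \quad \text{s.d.} = \sqrt{\frac{mnrs}{N^2(N-1)}}.$$

Defining

$$u = \frac{a - \dfrac{rm}{N}}{\sqrt{\dfrac{mnrs}{N^2(N-1)}}}$$

as in (1), equation (4) becomes

$$p(u \mid r) = \frac{1}{\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(u - \sqrt{\frac{mnrs}{N^2(N-1)}}\,\log_e\frac{p_1 q_2}{p_2 q_1}\right)^2\right] = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}(u - h(r))^2}, \tag{5}$$

where

$$h(r) = \sqrt{\frac{mnrs}{N^2(N-1)}}\,\log_e\frac{p_1 q_2}{p_2 q_1}, \tag{6}$$

and is a function of $r$ only, since $s\,(= N - r)$ and the other quantities are given.
If $p_1$ and $p_2$ are equal, then (5) reduces to the distribution of $u$ under $H_0$,

$$p(u) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{1}{2}u^2}, \tag{7}$$

which is the normal approximation used in obtaining the test criterion. This distribution
is independent of $r$; it is this fact that makes the critical regions on each diagonal $r$ similar
regions which can be combined into a single region in the two-dimensioned sample space.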





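But the distribution of $u$ under $H_1$, given by (5), is not independent of $r$. It is normal,
with the same s.d. as for (7), i.e. unity, but with its mean shifted by $h(r)$.

What may be termed the ‘conditional power’, for $r$ fixed, with regard to $H_1(p_1 \ne p_2)$ is
then for case (i),

$$P(|u| > u_{\frac{1}{2}\alpha} \mid r) = \int_{-\infty}^{-u_{\frac{1}{2}\alpha}} p(u \mid r)\,du + \int_{u_{\frac{1}{2}\alpha}}^{\infty} p(u \mid r)\,du = 1 - \frac{1}{\sqrt{2\pi}} \int_{-u_{\frac{1}{2}\alpha} - h(r)}^{u_{\frac{1}{2}\alpha} - h(r)} e^{-\frac{1}{2}u^2}\,du, \tag{8}$$

and for case (ii),

$$P(u > u_\alpha \mid r) = \int_{u_\alpha}^{\infty} p(u \mid r)\,du = \frac{1}{\sqrt{2\pi}} \int_{u_\alpha - h(r)}^{\infty} e^{-\frac{1}{2}u^2}\,du. \tag{9}$$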
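Expressions (8) and (9) are ordinary normal integrals shifted by $h(r)$, so they can be evaluated with any normal distribution function. The sketch below uses Python's standard library for this; the function names `h_of_r` and `conditional_power` are illustrative and not part of the paper.

```python
from math import log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def h_of_r(r, m, n, p1, p2):
    """Mean shift h(r) of equation (6) on the diagonal a + b = r."""
    N = m + n
    s = N - r
    log_odds_ratio = log(p1 * (1 - p2) / (p2 * (1 - p1)))
    return sqrt(m * n * r * s / (N**2 * (N - 1))) * log_odds_ratio


def conditional_power(r, m, n, p1, p2, alpha, two_sided=True):
    """Conditional power for fixed r: equation (8) if two_sided, else (9)."""
    h = h_of_r(r, m, n, p1, p2)
    if two_sided:
        u = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return 1 - (STD_NORMAL.cdf(u - h) - STD_NORMAL.cdf(-u - h))
    u = STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - STD_NORMAL.cdf(u - h)


# On the diagonal r = 15 of the m = 18, n = 12 lattice:
print(conditional_power(15, 18, 12, 0.4, 0.4, 0.10))  # h = 0, so this equals alpha
print(conditional_power(15, 18, 12, 0.6, 0.3, 0.10))
```

When $p_1 = p_2$ the shift vanishes and both expressions return the significance level, which is a convenient check on any implementation.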

Since $p(u, r) = p(u \mid r)\,p(r)$, the ‘over-all’ power function or, simply, the power function is

$$\int_{-\infty}^{\infty} P\{|u| > u_{\frac{1}{2}\alpha} \mid r\}\,p(r)\,dr \tag{10}$$

for case (i) with a similar expression for case (ii). It is clear that even with the simplified
expression (8) for the conditional power function, depending on $h(r)$, the labour involved
in calculating the over-all power (10) would in general be prohibitive. Some approximation
is therefore required for $p(r)$, i.e. the expression in (3). The simplest approximation is obtained
by assuming $r$ to be normally distributed. Since $a$ and $b$ are distributed binomially with
means $mp_1$ and $np_2$ and s.d.'s $\sqrt{mp_1q_1}$ and $\sqrt{np_2q_2}$, $r = a + b$ can be considered as distributed
normally with mean $= mp_1 + np_2$ and s.d. $= \sqrt{mp_1q_1 + np_2q_2}$. That is,

$$p(r) = \frac{1}{\sqrt{2\pi}\,\sqrt{mp_1q_1 + np_2q_2}} \exp\left[-\frac{\{r - (mp_1 + np_2)\}^2}{2(mp_1q_1 + np_2q_2)}\right]. \tag{11}$$

Hence, the expression (10) for the over-all power function becomes

$$\frac{1}{\sqrt{2\pi}\,\sqrt{mp_1q_1 + np_2q_2}} \int_{-\infty}^{\infty} P(|u| > u_{\frac{1}{2}\alpha} \mid r) \times \exp\left[-\frac{(r - mp_1 - np_2)^2}{2(mp_1q_1 + np_2q_2)}\right] dr \tag{12}$$

for case (i) with a similar expression for case (ii).

Though problems may arise where the conditional power function would be useful,
clearly it is the value of the over-all power that will help in determining in advance of the
experimental result how large the difference between $p_1$ and $p_2$ must be for the standard
test to have a given chance of establishing significance. For in such a preliminary survey, $r$
cannot be regarded as fixed.


4. Evaluation of over-all power 

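The over-all power is a function of $p_1$ and $p_2$, for given $m$, $n$ and $\alpha$, and may be written as
$\beta(p_1, p_2 \mid m, n, \alpha)$. Similarly, the conditional power function may be written as
$\beta(p_1, p_2 \mid m, n, \alpha, r)$. They will be denoted here for simplicity by $\beta$ and $\beta(r)$ respectively.

Suppose $\mu_1 = mp_1 + np_2$, $\sigma^2 = \mu_2 = mp_1q_1 + np_2q_2$ and $\mu_3$, $\mu_4$, etc. equal the higher
moments of the normal distribution (11). Then for case (i) or case (ii), from (12)

$$\beta = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} \beta(r) \exp\left[-\frac{(r - \mu_1)^2}{2\sigma^2}\right] dr.$$

Expanding $\beta(r)$ by Taylor's Theorem,

$$\beta(r) = \beta(\mu_1) + (r - \mu_1)\,\beta'(\mu_1) + \frac{(r - \mu_1)^2}{2!}\,\beta''(\mu_1) + \dots \tag{13}$$

and substituting in above, obtain

$$\beta = \beta(\mu_1)\,\frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} \exp\left[-\frac{(r - \mu_1)^2}{2\sigma^2}\right] dr + \beta'(\mu_1)\,\frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} \exp\left[-\frac{(r - \mu_1)^2}{2\sigma^2}\right] (r - \mu_1)\,dr$$
$$\qquad + \frac{\beta''(\mu_1)}{2!}\,\frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{\infty} \exp\left[-\frac{(r - \mu_1)^2}{2\sigma^2}\right] (r - \mu_1)^2\,dr + \dots$$
$$= \beta(\mu_1) + \frac{\beta''(\mu_1)}{2!}\,\mu_2 + \frac{\beta^{\mathrm{iv}}(\mu_1)}{4!}\,\mu_4 + \dots \tag{14}$$

It follows that a first approximation to the over-all power $\beta$ is

$$\beta(\mu_1) = 1 - \frac{1}{\sqrt{2\pi}} \int_{-u_{\frac{1}{2}\alpha} - h(\mu_1)}^{u_{\frac{1}{2}\alpha} - h(\mu_1)} e^{-\frac{1}{2}u^2}\,du \quad \text{for case (i)} \tag{15}$$

$$\text{or} \quad = \frac{1}{\sqrt{2\pi}} \int_{u_\alpha - h(\mu_1)}^{\infty} e^{-\frac{1}{2}u^2}\,du \quad \text{for case (ii)}, \tag{16}$$

substituting $\mu_1$ for $r$ in the expressions (8) and (9).

A second approximation will be derived and discussed later in section 10.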
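The first approximation amounts to evaluating the conditional power at $r = \mu_1 = mp_1 + np_2$. A minimal sketch follows, assuming the form of $h(r)$ in (6); the function name `first_approx_power` is illustrative. As a check it is applied to the case $m = n = 50$, $p_1 = 0.8$, $p_2 = 0.6$ treated in the germination illustration of section 8.

```python
from math import log, sqrt
from statistics import NormalDist

PHI = NormalDist()


def first_approx_power(p1, p2, m, n, alpha, two_sided=True):
    """First approximation (15) or (16): conditional power evaluated at r = mu_1."""
    N = m + n
    mu1 = m * p1 + n * p2                      # expected value of r = a + b
    h = sqrt(m * n * mu1 * (N - mu1) / (N**2 * (N - 1))) * log(
        p1 * (1 - p2) / (p2 * (1 - p1))
    )
    if two_sided:
        u = PHI.inv_cdf(1 - alpha / 2)
        return 1 - (PHI.cdf(u - h) - PHI.cdf(-u - h))
    u = PHI.inv_cdf(1 - alpha)
    return 1 - PHI.cdf(u - h)


print(round(first_approx_power(0.8, 0.6, 50, 50, 0.05), 2))
print(round(first_approx_power(0.8, 0.6, 50, 50, 0.01), 2))
```

The sketch keeps the factor $N - 1$ of (6); replacing it by $N$, as is done later for equal samples, changes the result only in the third decimal place for samples of this size.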


5. COMPARISON OF EXACT AND APPROXIMATE VALUES OF THE POWER FUNCTION 

Taking the two-sided test, its over-all power has been considered in three special cases.
Using the first approximation, the values of $\beta$ have been calculated for $m = 18$, $n = 12$ and
are shown in parentheses below the exact values in Tables 2(a) and 2(b). The exact and the
approximate values have also been calculated for $m = n = 15$ and $m = n = 30$ for a few
combinations of $p_1$, $p_2$, selected so as to give high power which, as we shall see later, is in the
range we are most interested in. These are given in Tables 3 (a) and 3 (b). In the latter case,
since $N$ is large, $N - 1$ is replaced by $N$ in the ratio $u$.


Table 3(a). Showing the power of the two-sided test, m = n = 15

                              Significance level, α
                       0.10                          0.02
  p1      p2     Exact    First    Second      Exact    First    Second
                 value    approx.  approx.     value    approx.  approx.
  0.3     0.4    0.141    0.16     0.154       0.034    0.04     0.041
  0.6     0.8    0.306    0.34     0.334       0.112    0.14     0.133
  0.1     0.3    0.389    0.44     0.422       0.149    0.20     0.193
  0.2     0.7    0.896    0.92     0.912       0.680    0.76     0.750
  0.05    0.5    0.916    0.97     0.964       0.736    0.90     0.876
  0.1     0.6    0.919    0.96     0.953       0.739    0.86     0.844
  0.2     0.8    0.974    0.98     0.982       0.872    0.93     0.923
  0.1     0.7    0.980    0.992    0.991       0.894    0.96     0.954










Table 3(b). Showing the power of the two-sided test, m = n = 30

                              Significance level, α
                       0.10                          0.02
  p1      p2     Exact    First    Second      Exact    First    Second
                 value    approx.  approx.     value    approx.  approx.
  0.05    0.3    0.884    0.93     0.904       0.631    0.78     0.762
  0.1     0.4    0.885    0.91     0.903       0.691    0.75     0.736
  0.3     0.7    0.937    0.95     0.947       0.807    0.83     0.824
  0.2     0.6    0.945    0.96     0.957       0.839    0.86     0.852
  0.1     0.5    0.977    0.988    0.985       0.902    0.94     0.934
  0.2     0.7    0.993    0.996    0.996       0.905    0.98     0.974


From these tables it can be seen generally that (1) the first approximation over-estimates
the power, (2) the agreement is better with large sample sizes, (3) the discrepancy is less when
$p_1$ and $p_2$ are near 0.5 than when one or both is very small, i.e. when we are in a corner of
the $p_1, p_2$-space, (4) the approximation is better for $\alpha = 0.10$ than for $\alpha = 0.02$.

Mathematically, the approximation is of course not very accurate even with two samples
of 30. But it must be remembered (a) that what is needed in practice is a simple procedure
for determining quickly the chance of establishing significance, and (b) that errors of
approximation are not everywhere of the same importance. When the chance of detecting
a worthwhile difference between $p_1$ and $p_2$ is only 0.40 it is clear that the sample size is
inadequate and this would still be our conclusion if the approximation gave 0.45. It is
perhaps only when the power approaches 0.90 that an error of this order becomes serious.
If the true chance were 0.90 (odds of 9 to 1) and the approximation gave 0.95 (odds of 19 to 1)
the result becomes somewhat misleading on this basis. Examination of Tables 3(a) and
3(b) suggests that if one value of $p$ is likely to be less than 0.1 or greater than 0.9, the
approximation is failing us for $n < 30$. But even here, as shown on p. 170 below when
estimating the sample size needed to provide a given power, using Tables 4 and 5, we shall not
be far out.


6. Case of equal sample sizes 

Suppose $m = n = \frac{1}{2}N$. Then $\mu_1 = n(p_1 + p_2)$. Substituting in (6) and replacing $N^2(N - 1)$ by
$N^3\,(= 8n^3)$, the error being negligible when $n$ is not too small, we find

$$h(\mu_1) = \sqrt{\tfrac{1}{8}\,n\,(p_1 + p_2)(2 - p_1 - p_2)}\;\log_e\frac{p_1 q_2}{p_2 q_1}.$$

In the case of the one-sided test, where the alternative is $p_1 > p_2$, $h(\mu_1)$ is positive. In the
other case with alternatives $p_1 < p_2$ or $p_1 > p_2$, $h(\mu_1)$ is negative or positive. But from the
expression (15) for the approximate power, it is seen that the sign of $h(\mu_1)$ is immaterial.
So, putting $h = |h(\mu_1)|$ and

$$k = k(p_1, p_2) = \sqrt{\tfrac{1}{8}(p_1 + p_2)(2 - p_1 - p_2)}\;\left|\log_e\frac{p_1(1 - p_2)}{p_2(1 - p_1)}\right|, \tag{17}$$

we have

$$h = k\sqrt{n}. \tag{18}$$




From (15) and (16) it follows that to the first approximation, for given $n$ and $\alpha$, the
power, $\beta$, is a function of $k$ only, i.e.

$$\beta(p_1, p_2 \mid n, n, \alpha) = \beta(k \mid n, \alpha).$$

The contours of constant $k$ have been drawn in Fig. 3 and the values of $k$ are written
alongside. Thus for $m = n$ the values of $p_1$, $p_2$ and the power of the test may be linked up
as follows:

(i) Fig. 3 relates $p_1$, $p_2$ to $k$.

(ii) Equation (18) gives $h$ in terms of $k$ and $n$.

(iii) The normal integrals (15) and (16) give the power in terms of $h$ and the significance
level $\alpha$ employed in the test.

It will be seen that to the first approximation the composite hypothesis,

$H_0(p_1 = p_2 = p$, unknown$)$,

is reduced to a simple hypothesis, $k = 0$. So the test in this case may be regarded as the test
of the hypothesis, $k = 0$, with alternatives, $k > 0$.
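The three-step link between $(p_1, p_2)$, $k$, $h$ and the power can be sketched in a few lines, using (17), (18) and the normal integrals (15) and (16). The function names are illustrative; the numerical case ($p_1 = 0.8$, $p_2 = 0.6$, $n = 50$) is the one used in the germination illustration of section 8.

```python
from math import log, sqrt
from statistics import NormalDist

PHI = NormalDist()


def k_value(p1, p2):
    """Equation (17): k depends on the two proportions only."""
    scale = sqrt((p1 + p2) * (2 - p1 - p2) / 8)
    return scale * abs(log(p1 * (1 - p2) / (p2 * (1 - p1))))


def power_from_k(k, n, alpha, two_sided=True):
    """Power from h = k * sqrt(n), via integral (15) or (16)."""
    h = k * sqrt(n)  # equation (18)
    if two_sided:
        u = PHI.inv_cdf(1 - alpha / 2)
        return 1 - (PHI.cdf(u - h) - PHI.cdf(-u - h))
    u = PHI.inv_cdf(1 - alpha)
    return 1 - PHI.cdf(u - h)


k = k_value(0.8, 0.6)
print(round(k, 3))                            # step (i): proportions to k
print(round(power_from_k(k, 50, 0.05), 2))    # steps (ii) and (iii)
```

Because $k$ does not involve $n$ or $\alpha$, one set of contours serves for every sample size and significance level; only the power attached to each contour changes.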

7. Tabulation 

(1) Tables of k as an alternative to Fig. 3. From (17), $k$ has been calculated for

$p_1, p_2 = 0.05(0.05)0.95$

and is given in Table 4 (printed at the end of the paper, p. 174). Since $k(1 - p_1, 1 - p_2)$ has
the same value as $k(p_1, p_2)$ the figures in the upper part of the table ($p_2 > p_1$) are not printed.

(2) Table 5. The power as a function of $h = k\sqrt{n}$ and $\alpha$. For the two-sided test we require
the integral (15). Denoting this by $P$, i.e.

$$P = 1 - \frac{1}{\sqrt{2\pi}} \int_{-u_{\frac{1}{2}\alpha} - h}^{u_{\frac{1}{2}\alpha} - h} e^{-\frac{1}{2}u^2}\,du, \tag{19}$$

the value of $P$ is given in columns 2-5 of Table 5 (printed at the end of the paper, p. 175).
It has been calculated from Tables of Probability Functions, Vol. ii (Federal Works Agency,
New York), using Lagrangian four-point interpolation, for the levels

$\alpha$ = 0.10, 0.05, 0.02 and 0.01 and for $h$ = 0.1(0.1)3.0(0.2)5.0.

As $h$ increases, the contribution to this integral from one tail rapidly becomes negligible,
so that (19) approximates to

$$\frac{1}{\sqrt{2\pi}} \int_{u_{\frac{1}{2}\alpha} - h}^{\infty} e^{-\frac{1}{2}u^2}\,du.$$

From (16), we see that this integral is the power of the one-sided test, applied at the
significance level $\frac{1}{2}\alpha$. The points at which this result is obtained to a given level of accuracy
are shown in Table 5; at least two-decimal accuracy occurs below the mark *,
three-decimal accuracy below †, and four below ‡.

To facilitate the use, values of $k = h/\sqrt{n}$ have been tabulated in the right-hand side
of Table 5 corresponding to the $h$ values in the first column and for

$n$ = 10(5)50(10)100, 150.

Linear interpolation is adequate for Tables 4 and 5, except in the corners of Table 4
where Lagrangian four-point interpolation may be necessary.




8. Application of the power function and use of tables 

Illustration 1. Laurence & Newell, while experimenting with the composition of soil
composts, recorded the following results of a germination trial with Primula sinensis seeds
(quoted in Statistical Analysis in Biology by K. Mather, 1946, p. 193). Two equal groups of
seeds were allowed to germinate in dishes containing filter papers soaked respectively in
rain water and in water allowed to seep through loam before use.


Table 6

                 Germinated    Ungerminated    Total
  Loam water         37             13           50
  Rain water         32             18           50
  Total              69             31          100


To test if the type of water affects germination, $u$ has been calculated to be 1.11 and
referred to the normal probability scale. It is seen that there is no significant difference
at the 5 % level.

It might be asked what magnitude of difference could we hope to detect using two
samples of 50. Suppose, for example, that for these populations 80 %, say, of seeds will
germinate in loam water and 60 % in rain water; what would have been the chance of
establishing significance at the 5 % level?

To obtain this, we find from Fig. 3 or from Table 4 the value of $k$ for $p_1 = 0.8$ and $p_2 = 0.6$.
From Table 4 it is seen to be 0.318. Then we enter Table 5 in the column of $n = 50$ and
find that this value of $k$ lies between the tabulated values, 0.311 and 0.325. They correspond
to the figures 0.5949 and 0.6331 in the column of $P$ for $\alpha = 0.05$. So, the first approximation
to the power lies between these values and by linear interpolation is found to be 0.61. If
the level chosen for the test is $\alpha = 0.01$, we find that the power is only 0.37. Clearly this
indicates that with two samples of 50, there is a very considerable risk of failing to establish
significance when the difference in chances of germination is of this order.

Suppose now we asked how large the samples should have been to give a chance of 0.9 of
establishing significance when the true percentages germinating are 80 in loam water and
60 in rain water? We then proceed as follows: In Table 5, entering the column of $P$ under
$\alpha = 0.05$, we find that 0.90 lies between 0.8925 and 0.9251 and following the rows of these
figures we see that $k = 0.318$ lies between the figures in the columns of $n = 100$ and $n = 150$.
As the interval is too wide for interpolation we find $h$ in column 1 corresponding to $P = 0.90$
in column 3 and then from the relation, $n = h^2/k^2$, we obtain $n$ to be nearly 105. If the level
chosen is $\alpha = 0.01$, the samples should be of size 150 to give the same power.

Illustration 2. In the early stages of production of an important piece of electrical
equipment, it has been found that the percentage of units failing under test varies, according
to the batch, between 30 and 40 %. An adjustment is suggested which would be considered
worthwhile if it leads to a 50 % reduction in failures. A trial is planned in which $N = 2n$
units are selected at random from a batch, half are adjusted and half not and the whole are
then tested. We want to know in advance about how large $n$ should be so that the odds are















19 to 1 that we shall not reach an inconclusive result (at the 5 % level) if there has been
a 50 % reduction in failures.

Here $p_1$ = 0.30 to 0.40 and $p_2$ = 0.15 to 0.20. From Fig. 3 we see that the contour $k = 0.3$
roughly passes through the range of these points $(p_1, p_2)$. From Table 5 we find $h = 3.3$ for
$P = 0.95$ and $\alpha = 0.10$ (since the test is of the one-sided form). So, $n = h^2/k^2 = 120$, nearly,
a surprisingly large number.
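The sample-size calculations of the two illustrations can be reproduced approximately without tables. The sketch below makes one simplifying assumption, stated in section 7: for the values of $h$ involved, the contribution of the far tail is negligible, so $h$ is taken as the sum of two normal deviates. The function name `required_n` is illustrative.

```python
from statistics import NormalDist


def required_n(k, power, tail_area):
    """Per-sample size from n = h^2 / k^2.

    tail_area is alpha / 2 for a two-sided test at level alpha, or alpha for a
    one-sided test. Assumes the far tail of (15) is negligible, so that
    h = u_(tail_area) + u_(1 - power).
    """
    z = NormalDist()
    h = z.inv_cdf(1 - tail_area) + z.inv_cdf(power)
    return (h / k) ** 2


# Illustration 1: k = 0.318, power 0.90, two-sided test at 5 % and at 1 %.
n_5pc = required_n(0.318, 0.90, 0.025)
n_1pc = required_n(0.318, 0.90, 0.005)
# Illustration 2: k = 0.3, power 0.95, one-sided test at 5 %.
n_one_sided = required_n(0.30, 0.95, 0.05)
print(n_5pc, n_1pc, n_one_sided)
```

The results agree with the figures read from Table 5 to within the accuracy expected of tabular interpolation.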



Fig. 3. Contours of constant k for use in determining the power when m=n. 
(Values of k are printed alongside the curves). 


Sometimes, without precisely specifying $p_1$ or $p_2$, we may consider an improvement worthwhile if it increases the percentage of effectives by a fixed amount, e.g. such that $p_2 - p_1 = 0.25$. Table 4 shows how this difference is roughly constant for $k$ constant in the central area; for example,

    p1      p2      k
    0.20    0.45    0.393
    0.25    0.50    0.376
    0.35    0.60    0.362
    0.45    0.70    0.366
    0.55    0.80    0.393

That is, the contour $k = 0.38$ passes very closely through these points $(p_1, p_2)$. With this value of $k$ we obtain $n$ as before.
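The near-constancy of $k$ along a line of fixed difference can be checked directly. The sketch below assumes the same form of $k$ as is obtained from (24) with $m = n$; it is an illustration, not a substitute for Table 4.

```python
from math import log, sqrt


def k_equal(p1, p2):
    """Assumed form of k for equal samples (from (24) with m = n)."""
    q1, q2 = 1 - p1, 1 - p2
    return sqrt((p1 + p2) * (2 - p1 - p2) / 8) * abs(log(p1 * q2 / (p2 * q1)))


# Pairs (p1, p2) with p2 - p1 = 0.25, as in the short table above.
pairs = [(0.20, 0.45), (0.25, 0.50), (0.35, 0.60), (0.45, 0.70), (0.55, 0.80)]
k_values = [round(k_equal(p1, p2), 3) for p1, p2 in pairs]
print(k_values)
```

The values stay within about 0.36 to 0.39 over the central area, which is why a single contour near $k = 0.38$ serves for the whole range.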

Illustration 3. If we have a random variable $x$ following a distribution law which is approximately normal, the most efficient estimator of the population mean, $\mu$, is the sample mean $\bar{x}$. On the assumption that the variance is not changing appreciably, we should use the $t$-test to determine whether there has been a change in $\mu$ between the drawing of a first and second sample. Practical requirements sometimes make it preferable, or even necessary, to observe only the number of individuals in the two samples, say $a$ and $b$ respectively, for which $x$ falls below a fixed level $x_0$. We can then test whether there has been a change in $\mu$ with the help of the ratio $u$ of equation (1), $p_1$ and $p_2$ being the chance that $x \le x_0$ for the first and second samples, respectively.



Fig. 4(a), (b).


Examples of this problem occur:

(i) In firing two types of shot, with striking velocity $x_0$, against a standard proof plate and observing the numbers, $a$ and $b$, that fail to perforate.

(ii) In dosage-mortality problems, where two drugs are compared only at a single dosage level, $x_0$.

The value of $x_0$ will frequently be at our choice and the generally accepted principle is to take a value so that $P\{x \le x_0\}$ is in the neighbourhood of 0.5. It is possible to confirm the soundness of this procedure in terms of the power function of the $u$ (or $\chi^2$) test. Fig. 4(a) represents two normal distributions with means $\mu_1$ and $\mu_2$ and a common standard deviation, $\sigma$; $p_1$ and $p_2$ are the proportionate areas under the curves below $x = x_0$. For a given relative shift in mean, $\delta = (\mu_2 - \mu_1)/\sigma$, values of $p_1$ and $p_2$ as functions of $x_0$ are readily found from tables of the normal probability integral. Fig. 4(b) shows, as solid line curves, the locus of points $(p_1, p_2)$ for which $\delta = 0.5$, 1.0 and 1.5. Our problem is to determine, for given $\delta$, the






value of $x_0$ which will maximize the chance that the $u$-test will establish significance. We know that in large samples of equal size ($m = n$), the power of the test is constant on the contours of constant $k$ (equation (17)). Two of these contours, for $k = 0.20$ and 0.63, approximately, are shown as dotted lines in Fig. 4(b); they touch the curves for $\delta = 0.5$ and 1.5, respectively, at the points where $p_1 + p_2 = 1$, and elsewhere fall beyond them, as shown.

Since for small samples we have found that the $k$-contours rather overestimate the power when $p_1$ or $p_2$ approach 0 or 1, it follows that the true power contour which touches a $\delta$-curve where $p_1 + p_2 = 1$ will fall even further outside it, towards the corners of the diagram, than the $k$-contours, as drawn. It appears therefore that:

(i) For a given $\delta$, the usual test for a difference in proportions has a maximum chance of establishing a difference if $x_0 = \tfrac{1}{2}(\mu_1 + \mu_2)$, and this is so for all levels of the power function.*

(ii) When $m = n$ are large and the true power contours become exactly those of $k$ = constant, shown in Figs. 3 and 4(b), $p_1 + p_2$ may differ considerably from 1.0 without appreciably reducing the power of the test.

9. The error in estimating the sample size necessary to ensure a given power

In section 6 it has been seen that the power obtained by the first approximation is generally an over-estimate of the true value. The effect of this will be to underestimate $n$ in carrying out the procedure illustrated in the previous example. It is possible to determine the magnitude of this error in the neighbourhood of $n = 16$ and 30, where the exact values of the power have been found and are given in Tables 3(a) and (b). Some results of this comparison are shown in Table 7.


Table 7. Comparing the estimates with the true values of the sample size, n

                        True power*            k from    h for true power       Estimate of
                                               Table 4   from Table 5           n = h^2/k^2
    m = n  p1     p2    a = 0.10   a = 0.02              a = 0.10   a = 0.02    a = 0.10   a = 0.02
    16     0.05   0.5   0.916      0.736       0.930     3.03       2.96        11         11
    16     0.1    0.6   0.919      0.739       0.878     3.05       2.97        13         12
    16     0.1    0.7   0.980      0.894       1.055     3.71       3.58        13         12
    30     0.05   0.3   0.884      0.631       0.563     2.84       2.66        26         23
    30     0.1    0.5   0.977      0.902       0.712     3.66       3.62        27         26
    30     0.2    0.7   0.993      0.966       0.786     4.12       4.16        28         28

(a denotes the significance level $\alpha$.)

* The power of the two-sided test.


* For $m = n = 30$, Table 5 shows that for $k = 0.20$ and for a significance level $\alpha = 0.05$, the two-sided test has a power of a little under 0.20 and for $k = 0.63$ of a little under 0.93.

For example, if $p_1 = 0.1$, $p_2 = 0.6$, Table 3(a) gives the true power for $m = n = 16$ as (i) 0.919 using the test with a 10% significance level and (ii) 0.739 using the test with a 2% level. Suppose now that we were to use Tables 4 and 5 to estimate how large the sample must be to give chances of 0.919 and 0.739 of establishing significance at the 10 and 2% levels respectively, when $p_1 = 0.1$, $p_2 = 0.6$. From Table 4 we find that $k = 0.878$, and interpolating in the second and third columns of Table 5 we obtain $h$ and so $n$, thus:


    P        a       h       n = h^2/k^2
    0.919    0.10    3.05    12.1, or 13 taking the next higher integer
    0.739    0.02    2.97    11.4, or 12 taking the next higher integer

The Table shows the extent of the underestimate of $n$. The error will become relatively smaller as $n$ is increased; but some adjustment is clearly desirable.

10. A second approximation to over-all power

The first term on the right-hand side of (14) has been taken as the first approximation to the over-all power. The other terms decrease fast since the higher derivatives, $\beta''(\mu_1), \beta^{(iv)}(\mu_1), \ldots$, rapidly approach zero, and a good approximation may be obtained by taking a few of these terms. We will, however, derive a second approximation by a slightly different approach.

The method of approximate product-integration developed by R. E. Beard (1947) could be employed to express $\beta$ as a weighted sum of terms $\beta(r_i)$, where $r_1, r_2, \ldots, r_n$ are $n$ values which might or might not be fixed beforehand. Considering only three such levels of $r$, we write formally

$$\int_{-\infty}^{\infty} \beta(r)\,p(r)\,dr = \{a_1\beta(r_1) + a_2\beta(r_2) + a_3\beta(r_3)\}\int_{-\infty}^{\infty} p(r)\,dr. \tag{20}$$

Expand the functions $\beta(r)$, $\beta(r_1)$, $\beta(r_2)$ and $\beta(r_3)$ by Taylor's Theorem to six terms as in (13) and write the remainder terms after the sixth. Then identifying the coefficients of $\beta(\mu_1)$ and its derivatives on both sides of (20), we have the equations

$$a_1 + a_2 + a_3 = 1,$$
$$a_1(r_1-\mu_1)^2 + a_2(r_2-\mu_1)^2 + a_3(r_3-\mu_1)^2 = \mu_2,$$
$$\cdots$$
$$a_1(r_1-\mu_1)^5 + a_2(r_2-\mu_1)^5 + a_3(r_3-\mu_1)^5 = 0.$$

Expressing the higher moments of the normal distribution $p(r)$ in terms of $\mu_2$, these equations yield the solution:

$$a_1 = a_3 = \tfrac{1}{6}, \quad a_2 = \tfrac{2}{3}; \qquad r_1 = \mu_1 - \sqrt{3\mu_2}, \quad r_2 = \mu_1, \quad r_3 = \mu_1 + \sqrt{3\mu_2}.$$

Since the left-hand side of (20) is $\beta$, it follows that

$$\beta = \tfrac{1}{6}\beta[\mu_1 - \sqrt{3\mu_2}] + \tfrac{2}{3}\beta(\mu_1) + \tfrac{1}{6}\beta[\mu_1 + \sqrt{3\mu_2}] + R, \tag{21}$$

where $R$ depends on the four remainder terms of the expansions $\beta(r), \ldots, \beta(r_3)$. It can be shown that $R$ is a multiple of $\beta^{(vi)}(\xi)/6!$ $(-\infty < \xi < \infty)$.

We may now compare the expression for $\beta$ in (21) with that in (14). Expanding the two functions $\beta[\mu_1 - \sqrt{3\mu_2}]$ and $\beta[\mu_1 + \sqrt{3\mu_2}]$ by Taylor's Theorem, we see that the right-hand side of (21), without $R$, includes the terms $\beta(\mu_1)$, $\beta''(\mu_1)\mu_2/2!$, $\beta^{(iv)}(\mu_1)\mu_4/4!$ of (14) completely











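and the following terms partially. This is also seen by comparing the form of $R$ with the form of the remainder term of (14) after the sixth, $R_1 = \beta^{(vi)}(\xi)\mu_6/6!$. Clearly,

$$\beta = \tfrac{1}{6}\beta[\mu_1 - \sqrt{3\mu_2}] + \tfrac{2}{3}\beta(\mu_1) + \tfrac{1}{6}\beta[\mu_1 + \sqrt{3\mu_2}] \tag{22}$$

gives a better approximation than the first three terms of (14),

$$\beta = \beta(\mu_1) + \beta''(\mu_1)\frac{\mu_2}{2!} + \beta^{(iv)}(\mu_1)\frac{\mu_4}{4!}.$$

As it is also easier to calculate, we will regard (22) as our second approximation to the over-all power function.

We can get a similar formula with equal weights, namely

$$\beta = \tfrac{1}{3}\left\{\beta[\mu_1 - \sqrt{\tfrac{3}{2}\mu_2}] + \beta(\mu_1) + \beta[\mu_1 + \sqrt{\tfrac{3}{2}\mu_2}]\right\}.$$

Unlike (22), this is derived by using only the first three moments of the distribution of $r$. More generally, using only the first three moments, we have

$$\beta = \frac{1}{2c^2}\left\{\beta[\mu_1 - c\sqrt{\mu_2}] + \beta[\mu_1 + c\sqrt{\mu_2}]\right\} + \left(1 - \frac{1}{c^2}\right)\beta(\mu_1),$$

where $c$ can be chosen as we like, subject to the restriction that none of the arguments should become negative.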

In section 8 we have seen how the over-all power for equal samples could be obtained to the first approximation, $\beta(\mu_1)$. For the second approximation (22) the values of $\beta[\mu_1 + \sqrt{3\mu_2}]$ and $\beta[\mu_1 - \sqrt{3\mu_2}]$ may be obtained in the same manner by entering Table 5 with the values of $h[\mu_1 + \sqrt{3\mu_2}]$ and $h[\mu_1 - \sqrt{3\mu_2}]$. From (6), with $m = n$, we have

$$h[\mu_1 + \sqrt{3\mu_2}] = \frac{1}{\sqrt{8n}}\Big\{\big[n(p_1+p_2) + \sqrt{3n(p_1q_1+p_2q_2)}\big]\big[2n - n(p_1+p_2) - \sqrt{3n(p_1q_1+p_2q_2)}\big]\Big\}^{1/2}\log_e\frac{p_1q_2}{p_2q_1}.$$

Putting $h_1 = |h[\mu_1 + \sqrt{3\mu_2}]|$, this becomes

$$h_1 = k\sqrt{n}\,\left\{1 + \frac{2(1-p_1-p_2)\sqrt{3n(p_1q_1+p_2q_2)} - 3(p_1q_1+p_2q_2)}{n(p_1+p_2)(2-p_1-p_2)}\right\}^{1/2},$$

where $k$ is the expression (17). Similarly if $h_2 = |h[\mu_1 - \sqrt{3\mu_2}]|$,

$$h_2 = k\sqrt{n}\,\left\{1 + \frac{-2(1-p_1-p_2)\sqrt{3n(p_1q_1+p_2q_2)} - 3(p_1q_1+p_2q_2)}{n(p_1+p_2)(2-p_1-p_2)}\right\}^{1/2},$$

and $h_1$ and $h_2$ can be calculated by obtaining $k$ from Table 4 and substituting the values of $k$, $p_1$, $p_2$ and $n$ in these two expressions.

If $P_1$, $P_2$ are the tabulated values of the integral $P$ of (19), corresponding to $h_1$, $h_2$, and if $P$ is the value corresponding to $h(\mu_1)$, then the second approximation is

$$\beta = \tfrac{1}{6}(P_1 + P_2 + 4P). \tag{23}$$

The value of $n$ obtained from Table 5 as described in section 8 may be improved with the help of the second approximation. Taking this value of $n$ we calculate $h_1$ and $h_2$ and obtain the corresponding $P_1$ and $P_2$. Then from (23),

$$P = \tfrac{1}{4}(6\beta - P_1 - P_2).$$

With this $P$ we enter Table 5 again and find the improved $n$.
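The second approximation (23) is simple to evaluate once $h$ is available as a function of $r$. The sketch below is an illustration under stated assumptions: $h(r)$ is taken in the form implied by (24) with $m = n$ (that is, with $N^3$ in place of $N^2(N-1)$), and the tabulated integral $P$ is replaced by the two-sided normal power $\Phi(h - z) + \Phi(-h - z)$. Both are assumptions made for the example, and the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def h_at(r, n, p1, p2):
    """Assumed h(r) for m = n: sqrt(r (2n - r) / (8n)) * |ln(p1 q2 / (p2 q1))|."""
    log_ratio = abs(log(p1 * (1 - p2) / (p2 * (1 - p1))))
    return sqrt(r * (2 * n - r) / (8 * n)) * log_ratio


def power_from_h(h, alpha):
    """Assumed stand-in for the tabulated P: two-sided normal power at level alpha."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return nd.cdf(h - z) + nd.cdf(-h - z)


def power_approximations(n, p1, p2, alpha):
    """Return (first approximation, second approximation) to the over-all power."""
    mu1 = n * (p1 + p2)                          # mean of r
    mu2 = n * (p1 * (1 - p1) + p2 * (1 - p2))    # variance of r
    step = sqrt(3 * mu2)
    P = power_from_h(h_at(mu1, n, p1, p2), alpha)
    P1 = power_from_h(h_at(mu1 + step, n, p1, p2), alpha)
    P2 = power_from_h(h_at(mu1 - step, n, p1, p2), alpha)
    return P, (P1 + P2 + 4 * P) / 6


# Case p1 = 0.1, p2 = 0.6 with samples of 16, 10% level.
first, second = power_approximations(16, 0.1, 0.6, 0.10)
print(round(first, 3), round(second, 3))
```

The arguments $\mu_1 \pm \sqrt{3\mu_2}$ must stay between 0 and $2n$ for the square root to be real, which is the practical form of the restriction on the arguments mentioned earlier.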





The values of the over-all power have been calculated on this second approximation for certain cases and shown alongside the first approximation values in Tables 3(a) and 3(b). It is seen that the second approximation improves the first, although it does not remove all the error.


11. General case of unequal sample sizes

The first approximation to the over-all power depends on

$$h(\mu_1) = \left\{\frac{mn(mp_1+np_2)(N-mp_1-np_2)}{N^3}\right\}^{1/2}\log_e\frac{p_1q_2}{p_2q_1}, \tag{24}$$

taking $N^3$ instead of $N^2(N-1)$ in (6). Suppose $\theta = m/n$; then

$$h = \sqrt{\tfrac{1}{2}N}\left\{\frac{2\theta}{(1+\theta)^4}(\theta p_1+p_2)(1+\theta-\theta p_1-p_2)\right\}^{1/2}\log_e\frac{p_1q_2}{p_2q_1} = \sqrt{\tfrac{1}{2}N}\,k(\theta), \text{ say.} \tag{25}$$

$k(\theta)$ corresponds to $k$ in the case of equal samples, but in general it is a function of $\theta$ as well as $p_1$ and $p_2$. If $\theta$ is known, as for example when $m$ and $n$ are given, the $k$ function defines as before a family of contours of constant power in the $p_1p_2$ plane.

To obtain a first approximation to the power, we evaluate $h(\mu_1)$ from (24) or (25) and enter Table 5. For the second approximation we have to evaluate also $h[\mu_1 \pm \sqrt{3\mu_2}]$ and as before obtain the weighted sum of the corresponding $P$'s.

The converse problem of determining the sample sizes for a given power can be solved only if the value of $\theta$ is given. For example, we may ask, 'What sample sizes should we take, one being double the other, if we want a 90% chance of detecting a significant difference when $p_1$ and $p_2$ have certain specified values?'

First we calculate $k(\theta)$. From the given power we obtain $h$ from Table 5. Then $N = 2h^2/k^2(\theta)$ will give $N$ to the first approximation. Since $\theta$ is given, $m$ and $n$ are obtained.
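The unequal-sample calculation can be sketched in the same way. The code below assumes the forms of (24) and (25) as written above and takes $h$ as a given number read from Table 5; the value 3.24 and the proportions used are placeholders chosen for the example, not figures from the paper.

```python
from math import log, sqrt


def k_theta(theta, p1, p2):
    """k(theta) of (25), with theta = m / n."""
    q1, q2 = 1 - p1, 1 - p2
    core = 2 * theta * (theta * p1 + p2) * (1 + theta - theta * p1 - p2) / (1 + theta) ** 4
    return sqrt(core) * abs(log(p1 * q2 / (p2 * q1)))


def h_24(m, n, p1, p2):
    """h(mu_1) of (24), computed directly from m and n."""
    N = m + n
    q1, q2 = 1 - p1, 1 - p2
    core = m * n * (m * p1 + n * p2) * (N - m * p1 - n * p2) / N ** 3
    return sqrt(core) * abs(log(p1 * q2 / (p2 * q1)))


def sample_sizes(theta, p1, p2, h):
    """First-approximation total N = 2 h^2 / k(theta)^2, split as m = theta * n."""
    N = 2 * h * h / k_theta(theta, p1, p2) ** 2
    n = N / (1 + theta)
    return theta * n, n


# One sample double the other (theta = 2); h = 3.24 is a placeholder table entry.
m, n = sample_sizes(2, 0.8, 0.6, 3.24)
print(round(m, 1), round(n, 1))
```

Setting $\theta = 1$ recovers the equal-sample rule $n = h^2/k^2$, which gives a quick check on any implementation.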


12. Summary 

The power function of the test of significance for the 2x2 table has been considered and an 
approximate method of deriving it has been developed. The usefulness of the idea of power 
in fixing in advance the sizes of samples is indicated. 

Tables have been provided for determining the power for a specified alternative, the sample sizes being given, and conversely.

I wish to express my grateful thanks to Prof. E. S. Pearson and Dr H. O. Hartley for their 
guidance in the study of this problem. 


REFERENCES 

Barnard, G. A. (1947). Biometrika, 34, 123.
Beard, R. E. (1947). J. Inst. Actu. 73, 356.
Neyman, J. & Pearson, E. S. (1933). Philos. Trans. A, 231, 289.
Neyman, J. & Tokarska (1936). J. Amer. Statist. Ass. 31, 318.
Pearson, E. S. (1947). Biometrika, 34, 139.
Tang, P. C. (1938). Statist. Res. Mem. 2, 126.



Table 5. Relating power and k, h, n and α


THE ANALYSIS OF CONTINGENCY TABLES WITH GROUPINGS 
BASED ON QUANTITATIVE CHARACTERS 

By F. YATES 

A $p \times q$ contingency table can be tested for independence by a $\chi^2$ test with $(p-1)(q-1)$ degrees of freedom. This is an over-all test which covers all forms of departure from proportionality, and is correspondingly insensitive to departures of a specified type.

If the nature of the data is such that departures of a particular type are to be expected, then a test of significance appropriate to departures of this type will be justified.

The present paper deals with the case in which one or both groupings are based on characters which are either directly quantitative or are in the form of gradings which can be regarded as having an underlying quantitative basis. The actual data which gave rise to the investigation are shown in Table 1. They were obtained in the course of a pilot inquiry into the conditions in which school children do their homework, carried out by the Department of Social Science, University of Liverpool, and I am indebted to Mr D. Chapman for permission to reproduce them here.


Table 1. Relation (in terms of numbers of children and percentages) between conditions under which homework was carried out, and the teacher's rating of the quality of that homework. (Each scale is graded, A being the highest rating)

    Teacher's                        Homework conditions
    rating     A            B            C            D            E            Total
    A          141 (46%)    67 (46%)     114 (39%)    79 (44%)     39 (43%)     440 (43%)
    B          131 (42%)    66 (45%)     143 (48%)    72 (40%)     35 (39%)     447 (44%)
    C          36 (12%)     14 (9%)      38 (13%)     28 (16%)     16 (18%)     132 (13%)
    Total      308 (100%)   147 (100%)   295 (100%)   179 (100%)   90 (100%)    1019 (100%)


It is clear from the percentages that the effect of homework conditions on the quality of 
the preparation, as judged by the teacher’s rating, is small. On the other hand, there is some 
slight trend, and the question therefore arises whether this trend has any significance, or, 
more generally, what is its estimated magnitude, and what are the errors of this estimate. 

A $\chi^2$ test of the whole table gives a value of $\chi^2$ equal to 9.16 (8 degrees of freedom), but such a test, as pointed out above, embraces all types of deviation from proportionality.

In material of this kind a rough test, which isolates the major part of any quantitative association between the two variates, can be made by reduction of the table to a 2 x 2 table by the grouping of appropriate parts and rejection of others. With the above data we might reasonably group homework conditions A and B, and D and E, rejecting condition C, and also reject the teacher's central rating B. This will give the values of Table 2.










This Table gives a value of $\chi_c^2$, i.e. $\chi^2$ corrected for continuity, equal to 3.03 (1 degree of freedom), corresponding to a probability (for one tail of the $\chi_c$ distribution) of 0.041. (The use of a single tail is appropriate to testing whether there is evidence that improved conditions of homework improve the quality of preparation, ruling out the opposite contingency.) The test therefore indicates that there is significant evidence (at the 5% level) of some improvement.

The above rough test, however, is open to several objections. In the first place, since the grouping is arbitrary, there is always a possibility that the statistician will allow himself to be influenced by the data in his choice of grouping, a grouping which gives a high degree of association being chosen. If this occurs, the test of significance is clearly vitiated. Secondly, different workers may in good faith choose different groupings of the same data, and this may lead to arguments of a type that are likely to discredit the science of statistics. Thirdly, the choice of grouping which is most appropriate in any given case depends on the marginal totals of the numbers of observations, and simple rules for the choice of grouping are not easy to devise. Fourthly, the test provides no estimate of the magnitude of the effects of the association.

Table 2. Condensation of part of Table 1

    Teacher's        Homework conditions
    rating     A+B           D+E           Total
    A          208 (81%)     118 (73%)     326 (78%)
    C          50 (19%)      44 (27%)      94 (22%)
    Total      258 (100%)    162 (100%)    420 (100%)


The test based on regression concepts, developed below, eliminates the element of choice, requires no elaborate computation, and also provides estimates of the magnitude of the effect of each variate on the other.

The following notation will be adopted. Letters without dashes will be taken to indicate the results of operations on the rows, and with dashes the results of the same operations on the columns. The $r$ row totals will be denoted by $N_1, N_2, \ldots$, the $r'$ column totals by $N'_1, N'_2, \ldots$, and the grand total by $T$. We shall also require an extended summation notation, in which the $x$'s may denote any set of $r$ quantities, as follows,

$$S_0(x) = x_1 + x_2 + \ldots + x_r,$$
$$S_1(x) = -\tfrac{1}{2}(r-1)x_1 - \tfrac{1}{2}(r-3)x_2 - \ldots + \tfrac{1}{2}(r-1)x_r,$$
$$S_2(x) = \tfrac{1}{4}(r-1)^2x_1 + \tfrac{1}{4}(r-3)^2x_2 + \ldots + \tfrac{1}{4}(r-1)^2x_r,$$

with three similar functions extending to $r'$ terms distinguished by dashes.

For each of the $r'$ columns a mean score is calculated, assigning a score of $-\tfrac{1}{2}(r-1)$ to the first row, $-\tfrac{1}{2}(r-3)$ to the second row, etc. and $+\tfrac{1}{2}(r-1)$ to the last row. If the numbers in the different sub-groups of the first column are $n_1, n_2, \ldots, n_r$, the mean score for the column is

$$u'_1 = \frac{1}{N'_1}S_1(n),$$

and the total score is

$$U'_1 = S_1(n).$$

If the numbers $n_1, n_2, \ldots, n_r$ are regarded as a sample from a multinomial distribution with probabilities of, say, $p_1, p_2, \ldots, p_r$, the variance of $n_1$ will be $N'_1p_1(1-p_1)$, and the covariance of $n_1$ and $n_2$ will be $-N'_1p_1p_2$, etc. The variance of $u'_1$ can then be simply calculated, and will be found to be

$$V(u'_1) = \frac{1}{N'_1}\left[S_2(p) - \{S_1(p)\}^2\right].$$

The regression of the mean scores $u'_1, u'_2, \ldots, u'_{r'}$ for the $r'$ columns on column number can now be calculated, weighting according to the variances of the $u'$'s, provided we can make some assumption as to the values of the $p$'s.

In order to obtain a test of significance we may start with the hypothesis that the sets of $p$'s are identical for all columns, i.e. that $p_1$ is the same from column to column, etc. Then the variances of the $u'$'s are inversely proportional to the numbers in the columns, i.e. the $N'$'s, and in calculating the regression we may therefore take weights equal to the $N'$'s. Estimates of the values of the $p$'s may be derived from the row totals, so that $p_1 = N_1/T$, etc., when we shall have

$$V(u'_1) = \frac{1}{N'_1T^2}\left[TS_2(N) - \{S_1(N)\}^2\right],$$

etc.

If the regression equation is taken in the form

$$u' = m' + b'\{x - \tfrac{1}{2}(r'+1)\},$$

the equations of estimation for $m'$ and $b'$ are:

$$m'S'_0(N') + b'S'_1(N') = S'_0(U'), \qquad m'S'_1(N') + b'S'_2(N') = S'_1(U').$$

We have

$$S_0(N) = S'_0(N') = T, \quad S'_0(U') = S_1(N), \quad S_0(U) = S'_1(N'), \quad S_1(U) = S'_1(U').$$

If we put

$$A = TS_2(N) - \{S_1(N)\}^2,$$
$$A' = TS'_2(N') - \{S'_1(N')\}^2,$$
$$B = TS_1(U) - S_1(N)S'_1(N'),$$

the solution of the above equations gives $b' = B/A'$, with the corresponding regression on row scores $b = B/A$.

The variance per unit weight is $A/T^2$, and consequently

$$V(b') = \frac{T}{A'}\cdot\frac{A}{T^2} = \frac{A}{A'T}.$$

Similarly, $V(b) = A'/AT$. The appropriate test of significance is therefore given by

$$\chi^2 = \frac{b^2}{V(b)} = \frac{b'^2}{V(b')} = \frac{B^2T}{AA'},$$

with one degree of freedom.

The above test is unaffected by interchange of rows and columns, as should be the case. It can easily be verified that the test reduces to the ordinary $\chi^2$ test of a 2 x 2 contingency table (apart from the correction for continuity) if the values in all but two of the rows and two of the columns are put equal to zero. The correction for continuity is not here of importance in view of the large number of possible alternatives with given marginal totals.

The test has been arrived at by assuming that the column totals only are fixed, but with the additional approximation involved in assuming that the $p$'s for each column are given by the marginal totals for the rows. This approach has the advantage of indicating the appropriate criterion $B^2T/AA'$ for the type of association which it is desired to test. Once the criterion is determined, however, it can be shown, following Fisher (1922, 1925), that this criterion, which is linear in the deviations from expectation, and orthogonal with the linear functions representing the row and column totals, will in large samples be distributed as $\chi^2$ with 1 degree of freedom when both sets of marginal totals are held fixed.

The values for the regression coefficients $b$ and $b'$ give estimates of the change in mean score with unit change of row and column respectively. It should be noted that the variances of $b$ and $b'$ given above are based on the assumption that there is no association between the two variates and become progressively more inaccurate as the degree of association increases. In general they will be over-estimates of the true variances. Nothing more accurate is likely to be required, however, except when there is very marked association, in which case the assumption of linear regression in the mean scores over the whole range is unlikely to provide an adequate mathematical description of the association.

If only one of the classifications, say the rows, is quantitative, a test for the homogeneity of the mean scores for the columns may be derived from the variances of these mean scores. The quantity

$$Q = S'_0\left\{\frac{u'^2}{V(u')}\right\} - \frac{\left[S'_0\{u'/V(u')\}\right]^2}{S'_0\{1/V(u')\}} = \frac{T^2}{A}\left\{S'_0(u'U') - \bar{u}'S'_0(U')\right\},$$

where $\bar{u}'$ is the mean score for all the observations, will be distributed as $\chi^2$ with $r' - 1$ degrees of freedom. It is easily verified that this test reduces to the ordinary $\chi^2$ test for a $2 \times r'$ table when there are only two rows.

It may be noted that any system of scoring may be assigned to each of the classifications; there is no need to adopt a system with equal intervals between each class if the nature of the classification is such that scoring with unequal intervals is more appropriate. If, for example, in the data of Table 1 the teachers had given their opinion that the difference between their classes A and B in quality of preparation was only half that between B and C, scores of +1, 0, and -2 might have been adopted. In the absence of any such indications, however, the appropriate procedure will be to assume that graded classifications are intended to represent equal intervals on some scale, unless the data themselves are used to determine the optimal system of scoring, by some procedure analogous to that given by Fisher (1946), para. 49.2, or by reference to some assumed distribution of the scores in the population. Examples of the latter procedure are described by Pearson & Moul (1925), and also in K. Pearson's Tables for Statisticians and Biometricians, Part II, pp. xxiii-xxvi (1931). In this connexion see also E. S. Pearson (1923). Such procedures, however, are only likely to be worth while in exceptional cases, and with very extensive data.

The computational procedure when the scores are assumed is very simple, and is illustrated in Table 3 for a 4 x 5 table with quantitative classifications. The scores used for the rows have been multiplied by 2 for convenience in computation. The value of $b$ obtained will therefore represent half the change to be expected in the mean score from row to row.

The total scores $U$ and $U'$ in Table 3 are calculated by summing the products of the scores and the numbers in the corresponding sub-classes. Checks are provided by carrying out the same operations on the totals. $S_2(N)$ and $S'_2(N')$ are obtained by multiplying the totals $N_1, N_2, \ldots$, and $N'_1, N'_2, \ldots$, by the squares of the scores. These must be checked. $S_1(U)$ is obtained in two ways by summing the products of the $U$'s and the $U'$'s with their corresponding scores.

The quantity $A$ is then obtained from the last three values in the total column, $A'$ from the last three values in the total line, and $B$ from the cross product of the 2 x 2 table formed by the total and total score rows and columns.


Table 3. Computational procedure

                        (Score)^2     4      1      0      1      4
                        Score        -2     -1      0     +1     +2
    (Score)^2   Score                 A      B      C      D      E     Total               Total score          Total (score)^2
    9           -3      A             -      -      -      -      -     N_1                 U_1
    1           -1      B             -      -      -      -      -     N_2                 U_2
    1           +1      C             -      -      -      -      -     N_3                 U_3
    9           +3      D             -      -      -      -      -     N_4                 U_4
                        Total         N'_1   N'_2   N'_3   N'_4   N'_5  T                   S_0(U) = S'_1(N')    S'_2(N')
                        Total score   U'_1   U'_2   U'_3   U'_4   U'_5  S'_0(U') = S_1(N)   S_1(U) = S'_1(U')
                        Total (score)^2                                 S_2(N)



Table 4. Analysis of the data of Table 1

                                        Homework conditions
                        (Score)^2        4       1       0       1       4
                        Score           +2      +1       0      -1      -2               Total    Total        Mean
    (Score)^2   Score   Teacher's        A       B       C       D       E     Total     score    (score)^2    score
                        rating
    1           +1      A              141      67     114      79      39      440      +192                  +0.44
    0            0      B              131      66     143      72      35      447      +186                  +0.42
    1           -1      C               36      14      38      28      16      132       +26                  +0.20
                        Total          308     147     295     179      90     1019      +404     1918         +0.40
                        Total score   +105     +53     +76     +51     +23     +308      +166
                        Total (score)^2                                          572
                        Mean score   +0.34   +0.36   +0.26   +0.28   +0.26    +0.30

The analysis of the data of Table 1 is given in Table 4, which also shows the mean scores (not required in the analysis). From the values of this table we have:

$$A = 1019 \times 572 - 308^2 = 488{,}004,$$
$$A' = 1019 \times 1918 - 404^2 = 1{,}791{,}226,$$
$$B = 1019 \times 166 - 308 \times 404 = 44{,}722,$$
$$b = B/A = 44{,}722/488{,}004 = 0.09164,$$
$$b' = B/A' = 44{,}722/1{,}791{,}226 = 0.02497,$$
$$\text{S.E. of } b = \sqrt{\frac{A'}{AT}} = \sqrt{\frac{1{,}791{,}226}{488{,}004 \times 1019}} = \pm 0.0600,$$
$$\text{S.E. of } b' = \sqrt{\frac{A}{A'T}} = \sqrt{\frac{488{,}004}{1{,}791{,}226 \times 1019}} = \pm 0.0163,$$
$$\chi^2 = \frac{B^2T}{AA'} = \frac{44{,}722^2 \times 1019}{488{,}004 \times 1{,}791{,}226} = 2.332.$$


Reference to a table of the normal integral indicates that $P = 0.062$ (one tail), and there is therefore a probability of about 1 in 16 of obtaining by chance as great an apparent improvement as is indicated by the data, if homework conditions have in fact no influence on the quality of the homework as shown by the teacher's rating. More important, in data of this kind, the analysis indicates that the amount of improvement, if any, is likely to be small. In terms of the improvement to be expected in changing conditions from the worst (E) to the best (A) the estimated improvement, $4b'$, is 0.100, the limits given by once and twice the standard error being +0.03 to +0.17 and -0.03 to +0.23. We may therefore state that the true value of this improvement is not likely to be substantially greater than +0.17 and is almost certainly not greater than +0.23, nor is it likely to be substantially less than +0.03, and is almost certainly not less than -0.03.
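The arithmetic of the regression test can be retraced from the body of Table 1 alone. The following sketch uses the scores of Table 4 (+1, 0, -1 for the ratings and +2 to -2 for the homework conditions); the function name and the dictionary keys are illustrative choices, not notation from the paper.

```python
def trend_test(table, row_scores, col_scores):
    """Compute A, A', B, the two regression coefficients and chi-squared (1 d.f.)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    T = sum(row_tot)
    S1_N = sum(s * n for s, n in zip(row_scores, row_tot))       # S_1(N)
    S2_N = sum(s * s * n for s, n in zip(row_scores, row_tot))   # S_2(N)
    S1_Np = sum(s * n for s, n in zip(col_scores, col_tot))      # S'_1(N')
    S2_Np = sum(s * s * n for s, n in zip(col_scores, col_tot))  # S'_2(N')
    S1_U = sum(rs * cs * n
               for rs, row in zip(row_scores, table)
               for cs, n in zip(col_scores, row))                # S_1(U)
    A = T * S2_N - S1_N ** 2
    A_dash = T * S2_Np - S1_Np ** 2
    B = T * S1_U - S1_N * S1_Np
    return {"A": A, "A_dash": A_dash, "B": B,
            "b": B / A, "b_dash": B / A_dash,
            "chi2": B * B * T / (A * A_dash)}


# Body of Table 1: rows are teacher's ratings A, B, C; columns are conditions A to E.
table_1 = [
    [141, 67, 114, 79, 39],
    [131, 66, 143, 72, 35],
    [36, 14, 38, 28, 16],
]
result = trend_test(table_1, [1, 0, -1], [2, 1, 0, -1, -2])
print(result)
```

Because $A$, $A'$ and $B$ are built from integer totals, they can be compared exactly with the printed values before any division is carried out.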

It will be noted that the exact test in this instance gives a lower degree of significance than the rough test based on reduction to a 2 x 2 table. This, of course, does not imply that the exact test is less sensitive. With any given set of data the verdicts of different tests of significance may differ considerably owing to chance causes. This does not preclude the use of rough tests when considerations of speed and computational labour are paramount, but once the exact test has been evaluated the verdict of the simpler but less appropriate test must be set aside. The decision as to the test to be used should also be made without reference to the actual data; it is, of course, inadmissible to make a rough test, accept its verdict if significant, and proceed to an exact test if it fails to indicate significance.

As an example of the computations when one classification only is quantitative we may test the deviations between the mean scores for different homework conditions. This test would be required if the homework conditions were qualitative categories and not an ordered series. In this case the value of $\chi^2$ for 4 degrees of freedom will be given by

$$Q = \frac{1019^2}{488{,}004}(105 \times 0.34091 + \ldots - 308 \times 0.302257) = 3.826,$$

the mean scores being taken to 5 and 6 decimal places to ensure the necessary accuracy. It is clear that apart from the linear trend there are no significant variations in mean score.
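The homogeneity criterion $Q$ can be computed without rounding the mean scores, using $u'U' = U'^2/N'$ for each column. This is a sketch under the same scoring as Table 4 (+1, 0, -1 for the ratings); the function name is hypothetical.

```python
def homogeneity_q(table, row_scores):
    """Q = (T^2 / A) * (sum of U'^2 / N' over columns - S_1(N)^2 / T)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    T = sum(row_tot)
    S1_N = sum(s * n for s, n in zip(row_scores, row_tot))
    S2_N = sum(s * s * n for s, n in zip(row_scores, row_tot))
    A = T * S2_N - S1_N ** 2
    # Total score U' of each column under the row scores.
    U_dash = [sum(s * n for s, n in zip(row_scores, col)) for col in zip(*table)]
    sum_uU = sum(u * u / n for u, n in zip(U_dash, col_tot))
    return T * T / A * (sum_uU - S1_N ** 2 / T)


table_1 = [
    [141, 67, 114, 79, 39],
    [131, 66, 143, 72, 35],
    [36, 14, 38, 28, 16],
]
Q = homogeneity_q(table_1, [1, 0, -1])
print(round(Q, 3))
```

Working with exact fractions in this way gives a value that differs from the printed 3.826 only in the third decimal place, the difference arising from the rounded mean scores used in the hand computation.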


REFERENCES 

Fisher, R. A. (1922). J. R. Statist. Soc. 85, 87.
Fisher, R. A. (1925). Metron, 5, 3.
Fisher, R. A. (1946). Statistical Methods for Research Workers, 10th ed. Edinburgh: Oliver and Boyd.
Pearson, E. S. (1923). Biometrika, 7, 248.
Pearson, K. (1931). Tables for Statisticians and Biometricians, Part II. London: Biometrika Office.
Pearson, K. & Moul, M. (1925). Ann. Eugen., Lond., 1, 1.





THE PROBABILITY INTEGRAL TRANSFORMATION WHEN 
PARAMETERS ARE ESTIMATED FROM THE SAMPLE 

By F. N. DAVID and N. L. JOHNSON 

1. The probability integral transformation for testing goodness of fit and combining tests 
of significance was introduced by R. A. Fisher in 1932. Fisher’s objective was the significance 
of combined independent tests of significance, but his method also proved applicable to 
a certain limited range of tests for goodness of fit, as can be seen in K. Pearson (1933), 
J. Neyman (1937) and E. S. Pearson (1938). The transformation may be summarized briefly 
in the following way. Assume that there is a continuous random variable x whose elementary 
probability law is p(x), whence obviously 

$$\int_{-\infty}^{+\infty} p(x)\,dx = 1.$$

Consider a new random variable, y, connected with x by the relation 

$$y = \int_{-\infty}^{x} p(x)\,dx.$$

y is a monotonic non-decreasing function of x and $0 \le y \le 1$. Further, if p(y) is the elementary 
probability law of y, then 

$$p(y) = p(x)\,\frac{dx}{dy} = 1.$$

Hence in the interval [0; 1] all values of y are equally likely, or in common parlance, y is 
rectangularly distributed in the interval [0; 1], no matter what the elementary probability 
law of x. If therefore we have n independent random variables $x_j$ (j = 1, 2, ..., n) following 
a known continuous probability law which is completely specified by $H_0$, the hypothesis 
tested, then by means of the transformation 

$$y_j = \int_{-\infty}^{x_j} p(x_j \mid H_0)\,dx_j,$$

the x’s can be transformed into n independent random variables y which are rectangularly 
distributed. 
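
The transformation of § 1 can be illustrated numerically. The following is a minimal sketch, assuming a normal law with hypothetical parameters (mean 2, standard deviation 3) as the completely specified $H_0$; the function names `pit` and `decile_counts` are illustrative and not part of the original paper.

```python
import random
from statistics import NormalDist


def pit(sample, dist):
    """Map each observation x to y = F(x), the integral of p up to x under H0."""
    return [dist.cdf(x) for x in sample]


def decile_counts(ys):
    """Count how many transformed values fall in each tenth of [0, 1]."""
    counts = [0] * 10
    for y in ys:
        counts[min(int(y * 10), 9)] += 1
    return counts


random.seed(1)
h0 = NormalDist(mu=2.0, sigma=3.0)   # hypothetical, completely specified H0
xs = [random.gauss(2.0, 3.0) for _ in range(20000)]
ys = pit(xs, h0)
print(decile_counts(ys))             # counts should be roughly equal
```

With the parameters fully specified, the ten counts should be close to one another, which is what a rectangular distribution on [0; 1] implies.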

2. The transformation which we have just summarized is useful statistically in that tests 
based on a rectangular population can be made applicable to any variable of which the 
elementary probability law is known. However, because the parameters of the elementary 
probability law must be specified, it is clear that the range of application of any tests based 
on this transformation will be very restricted, for cases are rare in statistical practice when 
$H_0$ is completely specified. It seemed interesting to us to investigate the effect on the 
transformation of calculating estimates of the parameters from the data provided by the sample. 
For example, if the mean of the probability law is estimated from a sample of n quantities 
$X_1, X_2, \dots, X_n$, each of which is one observed value of n random variables $x_1, x_2, \dots, x_n$, the 
y’s obtained by the probability integral transformation will no longer be independent, 
neither will they be rectangularly distributed. We are able to show that the generality of 
the transformation in the case when the parameters are completely specified is lost as soon 
as we begin replacing unknown parameters by the sample estimates, and, as is intuitively 
obvious, the form of the probability law of y depends on the functional form of the common 
probability law of the x’s. 





3. We begin by stating the problem in a formal mathematical way and indicate the method 
whereby a general solution is reached. Assume that p(x) is a single valued continuous function 
of the form 

$$p(x) = f(x \mid \theta_1, \theta_2, \dots, \theta_s),$$

where, in the usual way, $\theta_1, \theta_2, \dots, \theta_s$ are parameters descriptive of the population, all of 
which may or may not be specified. It may be assumed for generality that none are specified 
and that in place of the unknown parameters, $\theta$, it is necessary to substitute functions of the 
sample values, say, 

$$F_1(x_1, x_2, \dots, x_n),\quad F_2(x_1, x_2, \dots, x_n),\quad \dots,\quad F_s(x_1, x_2, \dots, x_n),$$

where $x_1, x_2, \dots, x_n$ are the random variables of which the n observations which form the 
sample are the observed values. Thus we require to find the distribution of the variables 

$$y_i = \int_{-\infty}^{x_i} f(t \mid F_1, F_2, \dots, F_s)\,dt \quad\text{for } i = 1, 2, \dots, n.$$

We have immediately that 

$$\frac{\partial y_i}{\partial x_j} = \sum_{r=1}^{s} \frac{\partial F_r}{\partial x_j} \int_{-\infty}^{x_i} \frac{\partial f}{\partial F_r}\,dt \quad\text{for } i \ne j,$$

and 

$$\frac{\partial y_i}{\partial x_j} = f(x_i \mid F_1, F_2, \dots, F_s) + \sum_{r=1}^{s} \frac{\partial F_r}{\partial x_j} \int_{-\infty}^{x_i} \frac{\partial f}{\partial F_r}\,dt \quad\text{for } i = j.$$

In matrix notation we may write this 

$$\left[\frac{\partial y_i}{\partial x_j}\right] = \Delta_f + \left[\frac{\partial F_j}{\partial x_k}\right]\left[\int_{-\infty}^{x_j} \frac{\partial f}{\partial F_k}\,dt\right].$$

$\Delta_f$ is a diagonal matrix with diagonal elements $f(x_i \mid F_1, \dots, F_s)$. $[\partial F_j/\partial x_k]$ is an n × s matrix, 
$\partial F_j/\partial x_k$ being the element in the kth row and the jth column. $\left[\int_{-\infty}^{x_j} (\partial f/\partial F_k)\,dt\right]$ is an s × n matrix, 
$\int_{-\infty}^{x_j} (\partial f/\partial F_k)\,dt$ being the element in the kth row and the jth column. The rank of the matrix 
$[\partial y_i/\partial x_j]$ is not immediately obvious. In general it may be noted that it will be at least 
n − s, and that it will be less than n. A study of particular cases leads us to believe that where 
the s sample estimates are algebraic functions each of the other, as for example, the sample 
moment coefficients, then there will be s independent relationships between the y’s, and the 
rank of the matrix will be n − s. Where the sample estimates are not functions of one another, 
as for example in the case of the median and the standard deviation, the matrix rank will 
be between n − s and n. We have not been able to prove this in general but it should not be 
impossible to do so. 

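We shall assume that there are s independent relationships between the variables 
$y_1, y_2, \dots, y_n$. Under this last assumption we have 

$$\frac{\partial(x_1, \dots, x_n)}{\partial(y_1, \dots, y_{n-s}, F_1, \dots, F_s)} = \left[\prod_{i=1}^{n-s} f(x_i \mid F_1, \dots, F_s)\right]^{-1} \frac{\partial(x_{n-s+1}, \dots, x_n)}{\partial(F_1, \dots, F_s)}$$

(provided partial differentiation is, in fact, possible), whence, since 

$$p(x_1, \dots, x_n \mid \theta_1, \dots, \theta_s) = \prod_{i=1}^{n} f(x_i \mid \theta_1, \dots, \theta_s),$$

the joint probability law of $y_1, y_2, \dots, y_{n-s}, F_1, \dots, F_s$ may be written 

$$p(y_1, y_2, \dots, y_{n-s}, F_1, F_2, \dots, F_s \mid \theta_1, \theta_2, \dots, \theta_s) = \frac{\prod_{i=1}^{n} f(x_i \mid \theta_1, \dots, \theta_s)}{\prod_{i=1}^{n-s} f(x_i \mid F_1, \dots, F_s)} \left|\frac{\partial(x_{n-s+1}, \dots, x_n)}{\partial(F_1, \dots, F_s)}\right|.$$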


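Alternative expressions for the joint-probability law may be obtained by using the 
relationship 

$$p(x_1, \dots, x_{n-s}, F_1, \dots, F_s \mid \theta_1, \dots, \theta_s) = \prod_{i=1}^{n} f(x_i \mid \theta_1, \dots, \theta_s) \left|\frac{\partial(x_{n-s+1}, \dots, x_n)}{\partial(F_1, \dots, F_s)}\right|.$$

Substituting in the right-hand side of the joint law we have 

$$p(y_1, \dots, y_{n-s}, F_1, F_2, \dots, F_s \mid \theta_1, \theta_2, \dots, \theta_s) = \frac{p(x_1, \dots, x_{n-s}, F_1, F_2, \dots, F_s \mid \theta_1, \dots, \theta_s)}{\prod_{i=1}^{n-s} f(x_i \mid F_1, F_2, \dots, F_s)} = \frac{p(F_1, \dots, F_s \mid \theta_1, \dots, \theta_s)\; p(x_1, \dots, x_{n-s} \mid F_1, \dots, F_s, \theta_1, \dots, \theta_s)}{\prod_{i=1}^{n-s} f(x_i \mid F_1, F_2, \dots, F_s)}.$$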


4. In the previous section formal solutions only of the problem have been set down. For 
any particular case the analysis becomes somewhat complicated. Accordingly, in order to 
obtain a clear idea of the kinds of distributions arising, we shall first confine ourselves to the 
discussion of (α) the special case where only location and scale parameters appear in the 
probability law of the x’s and (β) the distribution of a single $y_i$. Under (β) we may note that 

$$y_i = \int_{-\infty}^{x_i} f(t \mid F_1, F_2, \dots, F_s)\,dt = Z_i(x_i, F_1, \dots, F_s), \text{ say.}$$

If the distribution of $Z_i$ is known, the distribution of $y_i$ may be found immediately, and in 
particular if we can write 

$$y_i = Z_i(x_i, F_1, \dots, F_s) = \int_{-\infty}^{z_i} g(t)\,dt,$$

then 

$$p(y_i) = \{g(z_i)\}^{-1}\,p(z_i).$$


5. It is not uncommon in statistical practice to find probability laws which are completely 
specified by a single parameter for location and a single parameter for scale. The normal 
curve is, of course, the classic example. Let $\xi$ be the parameter of location, $\sigma$ the scale 
parameter, and write $p(x) = f(x \mid \xi, \sigma)$. If 

$$y_i = \int_{-\infty}^{x_i} f(t \mid \xi, \sigma)\,dt = \int_{-\infty}^{(x_i - \xi)/\sigma} f(t \mid 0, 1)\,dt = \int_{-\infty}^{(x_i - \xi)/\sigma} f(t)\,dt, \text{ say,}$$

then we may write 

$$\frac{x_i - \xi}{\sigma} = \phi(y_i).$$

Either $\xi$, or $\sigma$, or both may be estimated from the observed values of the random variables x. 
We treat two distinct cases. 





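Case (i): $\sigma$ known and $\xi$ estimated 

We first suppose that the scaling parameter is known but that it is necessary to estimate 
a central measure of location. Suppose this to be a function $M(x_1, x_2, \dots, x_n)$ which may 
be written for brevity M(x). We have 

$$y_i = \int_{-\infty}^{x_i} f(t \mid M(x), \sigma)\,dt = \int_{-\infty}^{[x_i - M(x)]/\sigma} f(t)\,dt, \qquad (1)$$

so 

$$\frac{x_i - M(x)}{\sigma} = \phi(y_i).$$

It follows that 

$$M[\phi(y_i)] = 0,$$

provided M satisfies the usual conditions for a measure of location.* In this case therefore 
there is one relation between the $y_i$’s and $[\partial y_i/\partial x_j]$ is of rank n − 1. Using the general 
formula obtained in § 3, the joint-probability law of $y_1, y_2, \dots, y_{n-1}$ and M(x) is 

$$p(y_1, y_2, \dots, y_{n-1}, M \mid \xi, \sigma) = \frac{\prod_{i=1}^{n} f(x_i \mid \xi, \sigma)}{\prod_{i=1}^{n-1} f(x_i \mid M, \sigma)} \left|\frac{\partial x_n}{\partial M}\right|.$$

The distribution of any individual $y_i$ is simply obtained. For, since 

$$y_i = \int_{-\infty}^{[x_i - M(x)]/\sigma} f(t)\,dt,$$

we have 

$$p(y_i) = \left[f\!\left(\frac{x_i - M(x)}{\sigma}\right)\right]^{-1} p\!\left(\frac{x_i - M(x)}{\sigma}\right),$$

and if the distribution of $x_i - M(x)$ is known, the distribution of $y_i$ follows immediately. 
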
Case (ii): both $\xi$ and $\sigma$ estimated 

Assume that $\xi$ is estimated as before by $M(x_1, x_2, \dots, x_n) = M(x)$. Since now $\sigma$ is also 
unknown, suppose that it is estimated from the sample values by a measure of dispersion, 

$$D(x_1, x_2, \dots, x_n) = D(x).$$

We have 

$$y_i = \int_{-\infty}^{x_i} f(t \mid M(x), D(x))\,dt,$$

and 

$$\frac{x_i - M(x)}{D(x)} = \phi(y_i).$$

Provided D(x) is a function of the quantities $x_i - M(x)$ and satisfies the usual conditions for 
a measure of dispersion,† the $y_i$’s must now satisfy the two conditions, 

$$M[\phi(y_i)] = 0; \qquad D[\phi(y_i)] = 1.$$

The matrix $[\partial y_i/\partial x_j]$ is then of rank n − 2. The conditions are satisfied, for example, if 

$$M(x) = \bar{x}, \qquad D(x) = \left\{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\right\}^{1/2}.$$

By an argument precisely similar to that of case (i) we have that 

$$p(y_i) = \left[f\!\left(\frac{x_i - M(x)}{D(x)}\right)\right]^{-1} p\!\left(\frac{x_i - M(x)}{D(x)}\right),$$

where 

$$y_i = \int_{-\infty}^{[x_i - M(x)]/D(x)} f(t)\,dt. \qquad (2)$$

It is seen that $y_i$ is rectangularly distributed, i.e. $p(y_i) = 1$, if and only if 

$$p\!\left(\frac{x_i - M(x)}{D(x)}\right) = f\!\left(\frac{x_i - M(x)}{D(x)}\right).$$

This condition is not likely to be satisfied. 

For both cases (i) and (ii) it may be noted that $p(y_i)$ is the ratio of two probability laws 
with a transformation of variables given by (1) and (2) respectively, and that neither of these 
two probability laws depends on $\xi$ or on $\sigma$. 

* (i) $M(x_1 + a, x_2 + a, \dots, x_n + a) = M(x_1, x_2, \dots, x_n) + a$; (ii) $M(x, x, \dots, x) = x$. 

† (i) $D(x_1 + a, x_2 + a, \dots, x_n + a) = D(x_1, x_2, \dots, x_n)$; 
(ii) $D(x, x, \dots, x) = 0$; 
(iii) $D(kx_1, kx_2, \dots, kx_n) = |k|\,D(x_1, x_2, \dots, x_n)$. 



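6. Example I. Let 

$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{(x - \xi)^2}{2\sigma^2}\right\},$$

and as in the previous section consider two cases. 

For case (i) there are many good statistical reasons for choosing 

$$M(x) = \bar{x}$$

for the estimate of $\xi$ for this probability law. In the notation of § 4 

$$z_i = \frac{x_i - \bar{x}}{\sigma}, \qquad (3)$$

and 

$$p\!\left(\frac{x_i - \bar{x}}{\sigma}\right) = p(z_i) = \sqrt{\frac{n}{2\pi(n-1)}}\, \exp\left\{-\frac{n z_i^2}{2(n-1)}\right\}.$$

Applying the results of the preceding section we shall have 

$$p(y_i) = \sqrt{\frac{n}{n-1}}\, \exp\left\{-\frac{z_i^2}{2(n-1)}\right\},$$

where $z_i = \phi(y_i)$ is defined by 

$$y_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_i} e^{-\frac{1}{2}t^2}\,dt.$$

Clearly $p(y_i)$ has a maximum value $\sqrt{n/(n-1)}$ at $z_i = 0$, i.e. when $y_i = \tfrac{1}{2}$, and the probability 
law is symmetrical about this point. $p(y_i)$ is zero at the points $y_i = 0$ ($z_i = -\infty$) and 
$y_i = 1$ ($z_i = +\infty$). A graph of the function for three different values of n is given in Fig. 1. 
In order to compare $p(y_i)$ with the rectangular distribution we may find the points at which 
the curve crosses it. This will be when $p(y_i) = 1$ or when 

$$\frac{z_i^2}{2(n-1)} = -\tfrac{1}{2}\log\left(1 - \frac{1}{n}\right).$$

Expanding the logarithm as a series we have that 

$$z_i^2 = 1 - \frac{1}{2n} - \dots,$$

or, for n moderately large, $z_i$ is nearly equal to ±1. It follows that $p(y_i) = 1$ when 
$y_i \approx 0.159$ or 0.841. 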
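
The law of $y_i$ for the normal curve with estimated mean can be evaluated directly. The sketch below assumes the form $p(y_i) = \sqrt{n/(n-1)}\,\exp\{-z_i^2/2(n-1)\}$ with $z_i$ the standard normal deviate of $y_i$; the function names are illustrative only, and the sample sizes are those of Fig. 1.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal law, used for phi(y) and its inverse


def density_y_mean_estimated(y, n):
    """p(y) when the mean of a normal law is replaced by the sample mean."""
    z = STD.inv_cdf(y)
    return sqrt(n / (n - 1)) * exp(-z * z / (2 * (n - 1)))


def crossing_points(n):
    """Values of y at which p(y) = 1, from z^2 = -(n - 1) log(1 - 1/n)."""
    z = sqrt(-(n - 1) * log(1 - 1 / n))
    return STD.cdf(-z), STD.cdf(z)


for n in (6, 11, 21):
    lower, upper = crossing_points(n)
    print(n, density_y_mean_estimated(0.5, n), lower, upper)
```

As n increases the two crossing points move towards the normal probabilities at deviates of −1 and +1, in agreement with the approximate values quoted in the text.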

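In case (ii) for the same p(x), assume 

$$M(x) = \bar{x}, \qquad D(x) = s = \left[\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2\right]^{1/2}.$$

As before, write 

$$z_i = \frac{x_i - \bar{x}}{s};$$

then 

$$p(z_i) = \frac{\sqrt{n}}{(n-1)\,B\{\tfrac{1}{2}, \tfrac{1}{2}(n-2)\}} \left(1 - \frac{n z_i^2}{(n-1)^2}\right)^{\frac{1}{2}(n-4)}, \qquad -\frac{n-1}{\sqrt{n}} \le z_i \le \frac{n-1}{\sqrt{n}}.$$

It follows that 

$$p(y_i) = \frac{\sqrt{2\pi n}}{(n-1)\,B\{\tfrac{1}{2}, \tfrac{1}{2}(n-2)\}} \left(1 - \frac{n z_i^2}{(n-1)^2}\right)^{\frac{1}{2}(n-4)} e^{\frac{1}{2} z_i^2},$$

where, as before, 

$$y_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z_i} e^{-\frac{1}{2}t^2}\,dt.$$

A graph of this function, for the same sample sizes considered in case (i), is given in Fig. 2. 

Fig. 1. The probability integral transformation applied to the normal curve with estimated mean. 
(Curves for n = 6, n = 11 and n = 21, plotted against the scale of $y_i$.) 

Fig. 2. The probability integral transformation applied to the normal curve with estimated 
mean and standard deviation. (Curves for n = 6, n = 11 and n = 21, plotted against the scale of $y_i$. 
Maxima approximately at $y_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\pm\sqrt{2 + (1/n)}} e^{-\frac{1}{2}t^2}\,dt$.) 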
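
A numerical check of case (ii) is sketched below. It assumes the form of $p(y_i)$ with the constant $\sqrt{2\pi n}/[(n-1)B\{\tfrac12, \tfrac12(n-2)\}]$ and the standard deviation computed with divisor n − 1; the helper names `beta_fn` and `density_y_mean_sd_estimated` are hypothetical.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

STD = NormalDist()


def beta_fn(a, b):
    """Complete beta function B(a, b) through log-gamma functions."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def density_y_mean_sd_estimated(y, n):
    """p(y) when both mean and standard deviation of a normal law are estimated."""
    z = STD.inv_cdf(y)
    core = 1.0 - n * z * z / (n - 1) ** 2
    if core <= 0.0:
        return 0.0  # outside the admissible range of z
    const = sqrt(2.0 * pi * n) / ((n - 1) * beta_fn(0.5, 0.5 * (n - 2)))
    return const * core ** (0.5 * (n - 4)) * exp(0.5 * z * z)


for n in (6, 11, 21):
    y_peak = STD.cdf(sqrt(2.0 + 1.0 / n))  # upper maximum, as noted under Fig. 2
    print(n, y_peak, density_y_mean_sd_estimated(y_peak, n))
```

The two maxima lie symmetrically about $y_i = \tfrac{1}{2}$, so only the upper one is printed here.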


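7. Example II. Let x be distributed as $\chi^2$ with two degrees of freedom, i.e. let 

$$p(x) = \frac{1}{\theta} e^{-x/\theta} = f(x \mid \theta) \quad\text{for } x > 0.$$

If we estimate $\theta$ by 

$$M(x) = \bar{x},$$

then 

$$y_i = \frac{1}{\bar{x}} \int_{0}^{x_i} e^{-t/\bar{x}}\,dt = 1 - e^{-x_i/\bar{x}}.$$

In this case we know, writing $u_i = x_i/\bar{x}$, that 

$$p(u_i) = \frac{n-1}{n}\left(1 - \frac{u_i}{n}\right)^{n-2} \quad (0 < u_i < n).$$

Following the procedure of the previous sections, we have that 

$$p(y_i) = \frac{n-1}{n}\,\frac{1}{1 - y_i}\left[1 + \frac{1}{n}\log(1 - y_i)\right]^{n-2} \quad\text{for } 0 < y_i < 1 - e^{-n}. \qquad (5)$$

A graph of this function is given in Fig. 3. 

Fig. 3. The probability integral transformation applied to the law $p(x) = \frac{1}{\theta} e^{-x/\theta}$, 
$\theta$ being estimated from the data. (Curves for n = 6, n = 11 and n = 21, plotted against the scale of $y_i$. 
The maximum is at $1 - e^{-2} = 0.865$ whatever be n.) 

8. The joint-probability law of the $y_i$’s for Example I of § 6 follows from an application 
of § 3. If we are considering a normal distribution and if $M(x) = \bar{x}$ then 

$$y_i = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(x_i - \bar{x})/\sigma} e^{-\frac{1}{2}t^2}\,dt.$$

Since the quantities $x_i - \bar{x}$ are independent of $\bar{x}$ it follows that the y’s are also independent 
of $\bar{x}$. The most convenient formula to use would seem to be 

$$p(y_1, \dots, y_{n-1}, \bar{x} \mid \xi, \sigma) = \frac{p(x_1, \dots, x_{n-1}, \bar{x} \mid \xi, \sigma)}{\prod_{i=1}^{n-1} f(x_i \mid \bar{x}, \sigma)}.$$

Since 

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n-1} x_i + \frac{1}{n} x_n,$$

it is clear that 

$$p(\bar{x} \mid x_1, \dots, x_{n-1}, \xi, \sigma) = \frac{n}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2\sigma^2}\left[n(\bar{x} - \xi) - \sum_{i=1}^{n-1} (x_i - \xi)\right]^2\right\}.$$

Hence 

$$p(x_1, \dots, x_{n-1}, \bar{x} \mid \xi, \sigma) = \frac{n}{(\sqrt{2\pi}\,\sigma)^n} \exp\left(-\frac{1}{2\sigma^2}\left\{\sum_{i=1}^{n-1} (x_i - \xi)^2 + \left[n(\bar{x} - \xi) - \sum_{i=1}^{n-1} (x_i - \xi)\right]^2\right\}\right),$$

and 

$$\prod_{i=1}^{n-1} f(x_i \mid \bar{x}, \sigma) = \frac{1}{(\sqrt{2\pi}\,\sigma)^{n-1}} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{n-1} (x_i - \bar{x})^2\right].$$

The joint-probability law of $y_1, y_2, \dots, y_{n-1}$ and $\bar{x}$ will be 

$$p(y_1, \dots, y_{n-1}, \bar{x} \mid \xi, \sigma) = \frac{n}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{1}{2\sigma^2}\left\{n(\bar{x} - \xi)^2 + \left[\sum_{i=1}^{n-1} (x_i - \bar{x})\right]^2\right\}\right) = \frac{n}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{n(\bar{x} - \xi)^2}{2\sigma^2} - \frac{1}{2}\left[\sum_{i=1}^{n-1} \phi(y_i)\right]^2\right\}, \qquad (6)$$

$\phi(y_i)$ being defined as in (1). Integrate out for $\bar{x}$, and we have 

$$p(y_1, \dots, y_{n-1} \mid \xi, \sigma) = \sqrt{n}\, \exp\left[-\frac{1}{2}\left\{\sum_{i=1}^{n-1} \phi(y_i)\right\}^2\right].$$

It will be noted that this joint probability law is independent of both $\xi$ and $\sigma$. 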

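9. The exponential law discussed in § 7 differs from the examples in which the normal law 
was used in that in this particular example a measure of location is used to estimate a scale 
parameter. We have 

$$p(x) = \frac{1}{\theta} e^{-x/\theta},$$

whence, since 

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n-1} x_i + \frac{1}{n} x_n,$$

it is seen that 

$$p(\bar{x} \mid x_1, \dots, x_{n-1}, \theta) = \frac{n}{\theta} \exp\left(-\frac{1}{\theta}\left[n\bar{x} - \sum_{i=1}^{n-1} x_i\right]\right),$$

and also 

$$p(x_1, x_2, \dots, x_{n-1} \mid \theta) = \frac{1}{\theta^{n-1}} \exp\left(-\frac{1}{\theta}\sum_{i=1}^{n-1} x_i\right).$$

It follows that 

$$p(x_1, \dots, x_{n-1}, \bar{x} \mid \theta) = \frac{n}{\theta^n} \exp\left(-\frac{n\bar{x}}{\theta}\right),$$

and 

$$\prod_{i=1}^{n-1} f(x_i \mid \bar{x}) = \frac{1}{\bar{x}^{\,n-1}} \exp\left(-\frac{1}{\bar{x}}\sum_{i=1}^{n-1} x_i\right).$$

The joint-probability law of $y_1, \dots, y_{n-1}$ and $\bar{x}$ follows in a straightforward way, namely 

$$p(y_1, \dots, y_{n-1}, \bar{x} \mid \theta) = \frac{n\,\bar{x}^{\,n-1}}{\theta^n} \exp\left[-\frac{n\bar{x}}{\theta} + \frac{1}{\bar{x}}\sum_{i=1}^{n-1} x_i\right].$$

Remembering that 

$$y_i = 1 - e^{-x_i/\bar{x}},$$

the joint-probability law may be rewritten 

$$p(y_1, \dots, y_{n-1}, \bar{x} \mid \theta) = \frac{n\,\bar{x}^{\,n-1}}{\theta^n}\, e^{-n\bar{x}/\theta}\, \frac{1}{\prod_{i=1}^{n-1} (1 - y_i)},$$

where 

$$\frac{x_i}{\bar{x}} = -\log(1 - y_i), \qquad \sum_{i=1}^{n-1} [-\log(1 - y_i)] < n \quad\text{or}\quad \prod_{i=1}^{n-1} (1 - y_i) > e^{-n}.$$

Integrating out with respect to $\bar{x}$, 

$$p(y_1, \dots, y_{n-1} \mid \theta) = \frac{(n-1)!}{n^{n-1}}\, \frac{1}{\prod_{i=1}^{n-1} (1 - y_i)}, \qquad \prod_{i=1}^{n-1} (1 - y_i) > e^{-n}, \qquad (7)$$

again a result which is independent of the parameter of the probability law. 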
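
The exponential example of §§ 7 and 9 lends itself to a short numerical check. The sketch below assumes the single-variable law $p(y_i) = \frac{n-1}{n}\,[1 + \frac{1}{n}\log(1 - y_i)]^{n-2}/(1 - y_i)$ and the constant $(n-1)!/n^{n-1}$ of the joint law; the function names are illustrative and not taken from the paper.

```python
from math import exp, factorial, log


def density_y_exponential(y, n):
    """p(y) for the exponential law when theta is replaced by the sample mean."""
    if not 0.0 < y < 1.0 - exp(-n):
        return 0.0  # y cannot exceed 1 - e^{-n}
    return (n - 1) / n * (1.0 + log(1.0 - y) / n) ** (n - 2) / (1.0 - y)


def joint_constant(n):
    """Constant (n - 1)!/n^(n - 1) multiplying 1/prod(1 - y_i) in the joint law."""
    return factorial(n - 1) / n ** (n - 1)


y_peak = 1.0 - exp(-2.0)  # position of the maximum, the same for every n
for n in (6, 11, 21):
    print(n, density_y_exponential(y_peak, n), joint_constant(n))
```

For n = 2 the joint law reduces to the single-variable law, which gives a simple consistency check between the two expressions.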


10. The results of this investigation, which we have carried out partly in the general and 
partly in the particular, are obviously incomplete and should be succeeded by a fuller 
inquiry which would clear up the doubtful points which we have had to pass over, and 
possibly extend the general theory a little further. We feel that none of the questions which 
have been raised in the course of this inquiry are insoluble by algebraic analysis but it is 
uncertain whether it is profitable to proceed with the fuller inquiry until some of the 
statistical implications of what has been done become more clear. For example, we have 
noted that given n independent random variables, x, if s sample moments are calculated 
from them and used as estimates of the parameters of the probability law, then it appears 
that there will be s independent relationships between the y’s. Thus in this case the 
point $y_1, y_2, \dots, y_n$ is constrained to move in an (n − s)-dimensioned space within an n-dimensioned 
cube, and we have the exact analogue to the loss of degrees of freedom with $\chi^2$ when 
the parameters have to be estimated from the data. What is not clear is how the y’s are 
constrained when the sample estimates of the parameters are not the sample moments, and 
while this situation may not often be met with in practice, yet it should be explored. 

When the parameters of location and scale are estimated from the data it is clear that the 
distribution of any individual $y_i$, and the joint-probability law of the y’s also, will not be 
dependent on these unknown parameters of the probability law of the x’s, but will depend 
on the functional form of that law. This result appears capable of extension for the case 
when higher sample moments are also used for estimating parameters. This being so, there 
would seem to be two ways in which the joint probability law of the y’s may be utilized in 
statistical applications. First, it should be possible mathematically to form certain broad 
classes of functions for each of which the joint-probability laws of the y’s would be approximately 
the same, or second one may seek for some transformation of variables so that 
instead of the correlated $y_i$ we obtain n − s new independent variables following some 
distributions which are independent of the original p(x). Both these methods of attack 
may lead to results which will only be valid for large samples, but provided the results in 
either case have sufficient algebraic simplicity they should make possible certain generalizations 
in statistical analysis of which Neyman’s ‘smooth’ test for goodness of fit is only one 
important example. 


REFERENCES 

Fisher, R. A. (1932). Statistical Methods for Research Workers, § 21.1. 
Neyman, J. (1937). Skand. AktuarTidskr. 20, 149. 
Pearson, E. S. (1938). Biometrika, 30, 134. 
Pearson, K. (1933). Biometrika, 25, 379. 





A TABLE FOR THE CALCULATION OF WORKING PROBITS 
AND WEIGHTS IN PROBIT ANALYSIS 


By D. J. FINNEY ( Lecturer in the Design and Analysis of Scientific Experiment, 
University of Oxford) and W. L. STEVENS ( Admiralty ) 


The estimation of the parameters of a distribution of individual tolerances, from data 
relating to numbers of subjects manifesting a characteristic quantal response at different 
levels of a stimulus, is a problem frequently encountered in the application of statistical 
science to dose-mortality studies, biological assay, detonation of explosives, and other 
problems. A typical situation is that of exposing batches of insects to various doses of an 
insecticide, recording the proportion killed at each level of dose and then requiring to 
estimate the mean tolerance (or median lethal dose) of individual insects and the variance 
of the tolerance distribution. Gaddum (1933) and Bliss (1935a, b; 1938) have been instru- 
mental in developing a method, that of the probit transformation, which greatly simplifies 
the calculations necessary to the estimation. The exact statistical analysis appropriate to 
the transformation was first shown by Fisher (1935), and the theory and uses of the method 
have been discussed fully in many subsequent publications (Finney, 1947 a, b). 

Tables required in the practice of the method, in sufficient detail for most purposes, have 
been given by various writers (Fisher & Yates, 1943; Finney, 1947 a). Occasionally, however, 
the statistician needs values of the various functions at finer intervals of the argument, and 
for his benefit the following Table has been prepared. A brief account of the tabulated 
functions will suffice for all who are familiar with the probit method; those who require 
fuller information on the theory and analysis should consult the list of References. 

Given a proportion P, and its complement Q = 1 — P, the probit of P is, to all intents and 
purposes, the deviate from the mean which divides the normal curve of unit variance in the 
ratio P: Q. In the formal definition, however, 5 is added to the deviate in order to avoid the 
necessity of computing with negative numbers. The advantage of this modification may be 
questioned, but it is now well established and will be adopted here. The probit, Y, of the 
proportion P is thus defined by 

$$P = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{Y-5} e^{-\frac{1}{2}u^2}\,du.$$

The standard method of analysis makes use of the maximum and minimum working probits, 

$$Y_{\max.} = Y + \frac{Q}{Z}$$

and 

$$Y_{\min.} = Y - \frac{P}{Z},$$

and also of the range, 1/Z, where 

$$Z = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}(Y-5)^2}.$$



If n subjects receive the same stimulus, and r of them show the characteristic response, 
the empirical value for the proportion responding is 

p = r/n; 

the complement of this is denoted by q = 1 − p. The probits of a set of values of p should 
be approximately linearly related to x, the measure of the stimulus, and a line fitted by eye 
may be used to give a corresponding set of expected probits, Y. The working probit 
corresponding to each proportion is next calculated, from either 

y = Y + Q/Z − q/Z, 

or y = Y − P/Z + p/Z, 

using tabulated values of the maximum or the minimum working probit (whichever is the 
more convenient) and the range. An improved set of expected probits is then derived from 
the weighted linear regression equation of working probits on x, each y being assigned 
a weight, nw, where the weighting coefficient, w, is defined as 

w = Z²/PQ. 

The process may be repeated with the new set of Y values. The iteration converges to give 
a linear regression equation which is an estimate of 

Y = 5 + (x − μ)/σ, 

where μ is the mean and σ the standard deviation of the tolerance distribution. The method 
depends upon an assumption that the stimulus is measured on a scale for which individual 
tolerances are normally distributed: often the logarithm of ‘dose’ rather than dose itself 
is taken as x, in order to satisfy this condition more closely. 
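
The quantities entering one cycle of the calculation can be computed directly from the definitions. The following is a minimal sketch, not the authors' tabular procedure; the function names `probit_quantities` and `working_probit` are hypothetical, and the figures in the usage lines (119 killed out of 281, expected probit 4.81) are those of the Example given later in the paper.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD = NormalDist()  # normal curve of unit variance


def probit_quantities(Y):
    """Return P, Q, Z, the extreme working probits, the range and w for probit Y."""
    P = STD.cdf(Y - 5.0)
    Q = 1.0 - P
    Z = exp(-0.5 * (Y - 5.0) ** 2) / sqrt(2.0 * pi)
    return {
        "P": P, "Q": Q, "Z": Z,
        "Y_max": Y + Q / Z,        # maximum working probit
        "Y_min": Y - P / Z,        # minimum working probit
        "range": 1.0 / Z,
        "w": Z * Z / (P * Q),      # weighting coefficient
    }


def working_probit(Y, r, n):
    """Working probit y = Y_min + p/Z and weight nw for r responses out of n."""
    q = probit_quantities(Y)
    p = r / n
    return q["Y_min"] + p * q["range"], n * q["w"]


y, weight = working_probit(4.81, 119, 281)
print(round(y, 4), round(weight, 2))
```

The same working probit results from the other form, y = Y_max − q/Z, since Y_max − Y_min = 1/Z.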

The Table which follows gives $Y_{\max.}$ for Y = 3.58(0.01)9.00, $Y_{\min.}$ for Y = 1.00(0.01) 
6.42, 1/Z and w for Y = 1.00(0.01)9.00, all to four places of decimals. Below Y = 3.58, 
$Y_{\max.}$ exceeds 10.00, and above Y = 6.42, $Y_{\min.}$ is negative; it is then almost always more 
convenient to calculate working probits from the other function, but the function not 
tabulated can easily be obtained from the relationship 

$Y_{\max.} - Y_{\min.} = 1/Z.$ 

Between 3.58 and 6.42 both functions are tabulated. In order to save space the Table is 
arranged in parallel forward- and backward-reading columns; for the arguments Y and 
(10 − Y) values of 1/Z and w are the same, and simple relations exist between $Y_{\max.}$ and 
$Y_{\min.}$. All entries have been calculated to six or more places of decimals and rounded to four, 
except that w between Y = 2.7 and Y = 7.3 was obtained by collating two existing tables 
and checking discrepancies. 
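
The relations between the forward- and backward-reading columns can be verified with a few lines of code. This is a sketch under the definitions given above; `table_row` is a hypothetical helper, and direct computation may differ from the printed Table by a unit in the last place.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD = NormalDist()


def table_row(Y):
    """Return (Y_max, Y_min, 1/Z, w) for expected probit Y, rounded to four decimals."""
    P = STD.cdf(Y - 5.0)
    Q = 1.0 - P
    Z = exp(-0.5 * (Y - 5.0) ** 2) / sqrt(2.0 * pi)
    return tuple(round(v, 4) for v in (Y + Q / Z, Y - P / Z, 1.0 / Z, Z * Z / (P * Q)))


for Y in (5.0, 5.5, 6.0):
    forward = table_row(Y)
    backward = table_row(10.0 - Y)
    # Y_max(10 - Y) = 10 - Y_min(Y); 1/Z and w are common to Y and 10 - Y
    print(Y, forward, backward)
```

The symmetry Y_max(10 − Y) = 10 − Y_min(Y) is the "simple relation" that allows one line of the Table to serve two arguments.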

The values of P and Z, from which the present Table has been calculated, were taken 
from Tables of the Probability Function, Vol. II (1942), published by the Federal Works 
Project Administration for the City of New York. Values of Q/Z have been taken from, or 
checked against, W. F. Sheppard’s table, published as The Probability Integral (1939), 
Vol. VII of the British Association Mathematical Tables. 

Example 

In a batch of 281 insects receiving the same dose of insecticide, 119 are killed. The provisional 
probit regression line gives an expected probit of 4.81 for this dose; find the working 
probit and the weight to be attached to the observation. 





Expected   Maximum    Minimum    Range     Weighting     Maximum    Minimum    Expected
probit     working    working    1/Z       coefficient   working    working    probit
Y          probit     probit               Z²/PQ         probit     probit     Y
           Y + Q/Z    Y − P/Z                            Y + Q/Z    Y − P/Z

5.00       6.2533     3.7467     2.5066    0.6366        6.2533     3.7467     5.00
 .01        .2534      .7466      .5068     .6366         .2534      .7466     4.99
 .02        .2536      .7465      .5071     .6365         .2535      .7464      .98
 .03        .2539      .7461      .5078     .6364         .2539      .7461      .97
 .04        .2543      .7457      .5086     .6362         .2543      .7457      .96
5.05       6.2548     3.7450     2.5098    0.6360        6.2550     3.7452     4.95
 .06        .2555      .7444      .5111     .6358         .2556      .7445      .94
 .07        .2563      .7435      .5128     .6355         .2565      .7437      .93
 .08        .2572      .7425      .5147     .6351         .2575      .7428      .92
 .09        .2582      .7414      .5168     .6347         .2586      .7418      .91
5.10       6.2593     3.7401     2.5192    0.6343        6.2599     3.7407     4.90
 .11        .2605      .7387      .5218     .6338         .2613      .7395      .89
 .12        .2618      .7371      .5247     .6333         .2629      .7382      .88
 .13        .2632      .7353      .5279     .6327         .2647      .7368      .87
 .14        .2647      .7334      .5313     .6321         .2666      .7353      .86
5.15       6.2664     3.7314     2.5350    0.6314        6.2686     3.7336     4.85
 .16        .2681      .7292      .5389     .6307         .2708      .7319      .84
 .17        .2699      .7268      .5431     .6300         .2732      .7301      .83
 .18        .2718      .7242      .5476     .6292         .2758      .7282      .82
 .19        .2738      .7215      .5523     .6283         .2785      .7262      .81
5.20       6.2759     3.7186     2.5573    0.6274        6.2814     3.7241     4.80
 .21        .2781      .7156      .5625     .6266         .2844      .7219      .79
 .22        .2804      .7124      .5680     .6255         .2876      .7196      .78
 .23        .2828      .7090      .5738     .6245         .2910      .7172      .77
 .24        .2853      .7054      .5799     .6234         .2946      .7147      .76
5.25       6.2878     3.7016     2.5862    0.6223        6.2984     3.7122     4.75
 .26        .2905      .6977      .5928     .6211         .3023      .7095      .74
 .27        .2932      .6935      .5997     .6199         .3065      .7068      .73
 .28        .2960      .6892      .6068     .6187         .3108      .7040      .72
 .29        .2989      .6846      .6143     .6174         .3154      .7011      .71
5.30       6.3018     3.6798     2.6220    0.6161        6.3202     3.6982     4.70
 .31        .3049      .6749      .6300     .6147         .3251      .6951      .69
 .32        .3080      .6697      .6383     .6133         .3303      .6920      .68
 .33        .3112      .6643      .6469     .6119         .3357      .6888      .67
 .34        .3145      .6587      .6558     .6104         .3413      .6855      .66
5.35       6.3178     3.6528     2.6650    0.6088        6.3472     3.6822     4.65
 .36        .3213      .6469      .6744     .6072         .3531      .6787      .64
 .37        .3248      .6406      .6842     .6056         .3594      .6752      .63
 .38        .3283      .6340      .6943     .6040         .3660      .6717      .62
 .39        .3320      .6273      .7047     .6023         .3727      .6680      .61
5.40       6.3357     3.6203     2.7154    0.6005        6.3797     3.6643     4.60
 .41        .3394      .6130      .7264     .5987         .3870      .6606      .59
 .42        .3433      .6055      .7378     .5969         .3945      .6567      .58
 .43        .3472      .5978      .7494     .5951         .4022      .6528      .57
 .44        .3512      .5898      .7614     .5932         .4102      .6488      .56
5.45       6.3552     3.5815     2.7737    0.5912        6.4185     3.6448     4.55
 .46        .3593      .5729      .7864     .5893         .4271      .6407      .54
 .47        .3635      .5641      .7994     .5872         .4359      .6365      .53
 .48        .3677      .5550      .8127     .5852         .4450      .6323      .52
 .49        .3720      .5456      .8264     .5831         .4544      .6280      .51
5.50       6.3764     3.5360     2.8404    0.5810        6.4640     3.6236     4.50



Expected   Maximum    Minimum    Range     Weighting     Maximum    Minimum    Expected
probit     working    working    1/Z       coefficient   working    working    probit
Y          probit     probit               Z²/PQ         probit     probit     Y
           Y + Q/Z    Y − P/Z                            Y + Q/Z    Y − P/Z

5.50       6.3764     3.5360     2.8404    0.5810        6.4640     3.6236     4.50
 .51        .3808      .5260      .8548     .5788         .4740      .6192      .49
 .52        .3852      .5157      .8695     .5766         .4843      .6148      .48
 .53        .3898      .5052      .8846     .5744         .4948      .6102      .47
 .54        .3944      .4943      .9001     .5722         .5057      .6056      .46
5.55       6.3990     3.4831     2.9159    0.5699        6.5169     3.6010     4.45
 .56        .4037      .4715      .9322     .5676         .5285      .5963      .44
 .57        .4085      .4597      .9488     .5652         .5403      .5915      .43
 .58        .4133      .4475      .9658     .5628         .5525      .5867      .42
 .59        .4181      .4349      .9832     .5603         .5651      .5819      .41
5.60       6.4230     3.4220     3.0010    0.5579        6.5780     3.5770     4.40
 .61        .4280      .4088      .0192     .5554         .5912      .5720      .39
 .62        .4330      .3952      .0378     .5529         .6048      .5670      .38
 .63        .4381      .3812      .0569     .5503         .6188      .5619      .37
 .64        .4432      .3669      .0763     .5477         .6331      .5568      .36
5.65       6.4484     3.3522     3.0962    0.5451        6.6478     3.5516     4.35
 .66        .4536      .3370      .1166     .5425         .6630      .5464      .34
 .67        .4588      .3214      .1374     .5398         .6786      .5412      .33
 .68        .4641      .3055      .1586     .5371         .6945      .5359      .32
 .69        .4695      .2892      .1803     .5343         .7108      .5305      .31
5.70       6.4749     3.2724     3.2025    0.5316        6.7276     3.5251     4.30
 .71        .4803      .2551      .2252     .5288         .7449      .5197      .29
 .72        .4858      .2375      .2483     .5260         .7625      .5142      .28
 .73        .4914      .2194      .2720     .5232         .7806      .5086      .27
 .74        .4969      .2008      .2961     .5203         .7992      .5031      .26
5.75       6.5026     3.1819     3.3207    0.5174        6.8181     3.4974     4.25
 .76        .5082      .1623      .3459     .5145         .8377      .4918      .24
 .77        .5139      .1423      .3716     .5116         .8577      .4861      .23
 .78        .5197      .1219      .3978     .5086         .8781      .4803      .22
 .79        .5255      .1009      .4246     .5056         .8991      .4745      .21
5.80       6.5313     3.0794     3.4519    0.5026        6.9206     3.4687     4.20
 .81        .5372      .0574      .4798     .4996         .9426      .4628      .19
 .82        .5431      .0348      .5083     .4965         .9652      .4569      .18
 .83        .5490      .0116      .5374     .4935         .9884      .4510      .17
 .84        .5550     2.9880      .5670     .4904        7.0120      .4450      .16
5.85       6.5611     2.9638     3.5973    0.4873        7.0362     3.4389     4.15
 .86        .5671      .9389      .6282     .4841         .0611      .4329      .14
 .87        .5732      .9135      .6597     .4810         .0865      .4268      .13
 .88        .5794      .8875      .6919     .4778         .1125      .4206      .12
 .89        .5855      .8608      .7247     .4746         .1392      .4145      .11
5.90       6.5917     2.8335     3.7582    0.4714        7.1665     3.4083     4.10
 .91        .5980      .8056      .7924     .4682         .1944      .4020      .09
 .92        .6043      .7771      .8272     .4650         .2229      .3957      .08
 .93        .6106      .7478      .8628     .4617         .2522      .3894      .07
 .94        .6169      .7178      .8991     .4585         .2822      .3831      .06
5.95       6.6233     2.6872     3.9361    0.4552        7.3128     3.3767     4.05
 .96        .6297      .6558      .9739     .4519         .3442      .3703      .04
 .97        .6362      .6238     4.0124     .4486         .3762      .3638      .03
 .98        .6426      .5909      .0517     .4453         .4091      .3574      .02
 .99        .6491      .5573      .0918     .4420         .4427      .3509      .01
6.00       6.6557     2.5230     4.1327    0.4386        7.4770     3.3443     4.00





Expected   Maximum    Minimum    Range     Weighting     Maximum    Minimum    Expected
probit     working    working    1/Z       coefficient   working    working    probit
Y          probit     probit               Z²/PQ         probit     probit     Y
           Y + Q/Z    Y − P/Z                            Y + Q/Z    Y − P/Z

6.00       6.6557     2.5230     4.1327    0.4386        7.4770     3.3443     4.00
 .01        .6623      .4878      .1745     .4353         .5122      .3377     3.99
 .02        .6689      .4518      .2171     .4319         .5482      .3311      .98
 .03        .6755      .4150      .2605     .4285         .5850      .3245      .97
 .04        .6822      .3774      .3048     .4252         .6226      .3178      .96
6.05       6.6888     2.3387     4.3501    0.4218        7.6613     3.3112     3.95
 .06        .6956      .2994      .3962     .4184         .7006      .3044      .94
 .07        .7023      .2590      .4433     .4150         .7410      .2977      .93
 .08        .7091      .2178      .4913     .4116         .7822      .2909      .92
 .09        .7159      .1756      .5403     .4082         .8244      .2841      .91
6.10       6.7227     2.1324     4.5903    0.4047        7.8676     3.2773     3.90
 .11        .7296      .0883      .6413     .4013         .9117      .2704      .89
 .12        .7365      .0432      .6933     .3979         .9568      .2635      .88
 .13        .7434     1.9970      .7464     .3944        8.0030      .2566      .87
 .14        .7504      .9498      .8006     .3910         .0502      .2496      .86
6.15       6.7573     1.9014     4.8559    0.3876        8.0986     3.2427     3.85
 .16        .7643      .8520      .9123     .3841         .1480      .2357      .84
 .17        .7714      .8016      .9698     .3807         .1984      .2286      .83
 .18        .7784      .7498     5.0286     .3772         .2502      .2216      .82
 .19        .7855      .6970      .0885     .3738         .3030      .2145      .81
6.20       6.7926     1.6429     5.1497    0.3703        8.3571     3.2074     3.80
 .21        .7997      .5876      .2121     .3669         .4124      .2003      .79
 .22        .8068      .5310      .2758     .3634         .4690      .1932      .78
 .23        .8140      .4731      .3409     .3600         .5269      .1860      .77
 .24        .8212      .4140      .4072     .3565         .5860      .1788      .76
6.25       6.8284     1.3534     5.4750    0.3531        8.6466     3.1716     3.75
 .26        .8357      .2916      .5441     .3496         .7084      .1643      .74
 .27        .8429      .2282      .6147     .3462         .7718      .1571      .73
 .28        .8502      .1635      .6867     .3428         .8365      .1498      .72
 .29        .8575      .0972      .7603     .3393         .9028      .1425      .71
6.30       6.8649     1.0295     5.8354    0.3359        8.9705     3.1351     3.70
 .31        .8722     0.9602      .9120     .3325        9.0398      .1278      .69
 .32        .8796      .8893      .9903     .3291         .1107      .1204      .68
 .33        .8870      .8168     6.0702     .3256         .1832      .1130      .67
 .34        .8944      .7426      .1518     .3222         .2574      .1056      .66
6.35       6.9019     0.6668     6.2351    0.3188        9.3332     3.0981     3.65
 .36        .9093      .5892      .3201     .3155         .4108      .0907      .64
 .37        .9168      .5098      .4070     .3121         .4902      .0832      .63
 .38        .9243      .4286      .4957     .3087         .5714      .0757      .62
 .39        .9318      .3455      .5863     .3053         .6545      .0682      .61
6.40       6.9394     0.2606     6.6788    0.3020        9.7394     3.0606     3.60
 .41        .9469      .1736      .7733     .2986         .8264      .0531      .59
 .42        .9545      .0847      .8698     .2953         .9153      .0455      .58
 .43        .9621                 .9684     .2920                    .0379      .57
 .44        .9697                7.0691     .2887                    .0303      .56
6.45       6.9774                7.1720    0.2854                   3.0226     3.55
 .46        .9850                 .2771     .2821                    .0150      .54
 .47       6.9927                 .3845     .2788                    .0073      .53
 .48       7.0004                 .4943     .2756                   2.9996      .52
 .49        .0081                 .6064     .2723                    .9919      .51
6.50       7.0158                7.7210    0.2691                   2.9842     3.50




Table of working probits

Columns: expected probit Y; maximum working probit Y + Q/Z; range 1/Z; weighting coefficient Z²/PQ; minimum working probit Y - P/Z; expected probit Y.

Y     Y + Q/Z  1/Z         Z²/PQ    Y - P/Z  Y
6.50  7.0158   7.7210      0.2691   2.9842   3.50
6.51  7.0236   7.8380      0.2658   2.9764   3.49
6.52  7.0313   7.9577      0.2626   2.9687   3.48
6.53  7.0391   8.0800      0.2594   2.9609   3.47
6.54  7.0469   8.2050      0.2563   2.9531   3.46
6.55  7.0547   8.3327      0.2531   2.9453   3.45
6.56  7.0625   8.4633      0.2500   2.9375   3.44
6.57  7.0704   8.5968      0.2468   2.9296   3.43
6.58  7.0783   8.7333      0.2437   2.9217   3.42
6.59  7.0861   8.8728      0.2406   2.9139   3.41
6.60  7.0940   9.0154      0.2375   2.9060   3.40
6.61  7.1020   9.1613      0.2345   2.8980   3.39
6.62  7.1099   9.3105      0.2314   2.8901   3.38
6.63  7.1178   9.4630      0.2284   2.8822   3.37
6.64  7.1258   9.6190      0.2254   2.8742   3.36
6.65  7.1338   9.7785      0.2224   2.8662   3.35
6.66  7.1417   9.9417      0.2194   2.8583   3.34
6.67  7.1498   10.1086     0.2165   2.8502   3.33
6.68  7.1578   10.2794     0.2135   2.8422   3.32
6.69  7.1658   10.4540     0.2106   2.8342   3.31
6.70  7.1739   10.6327     0.2077   2.8261   3.30
6.71  7.1819   10.8156     0.2049   2.8181   3.29
6.72  7.1900   11.0027     0.2020   2.8100   3.28
6.73  7.1981   11.1941     0.1992   2.8019   3.27
6.74  7.2062   11.3900     0.1964   2.7938   3.26
6.75  7.2143   11.5905     0.1936   2.7857   3.25
6.76  7.2224   11.7957     0.1908   2.7776   3.24
6.77  7.2306   12.0058     0.1881   2.7694   3.23
6.78  7.2387   12.2208     0.1853   2.7613   3.22
6.79  7.2469   12.4409     0.1826   2.7531   3.21
6.80  7.2551   12.6662     0.1799   2.7449   3.20
6.81  7.2633   12.8969     0.1773   2.7367   3.19
6.82  7.2715   13.1331     0.1746   2.7285   3.18
6.83  7.2797   13.3750     0.1720   2.7203   3.17
6.84  7.2880   13.6227     0.1694   2.7120   3.16
6.85  7.2962   13.8764     0.1669   2.7038   3.15
6.86  7.3045   14.1362     0.1643   2.6955   3.14
6.87  7.3128   14.4023     0.1618   2.6872   3.13
6.88  7.3210   14.6749     0.1593   2.6790   3.12
6.89  7.3293   14.9541     0.1568   2.6707   3.11
6.90  7.3376   15.2402     0.1544   2.6624   3.10
6.91  7.3460   15.5333     0.1519   2.6540   3.09
6.92  7.3543   15.8337     0.1495   2.6457   3.08
6.93  7.3626   16.1414     0.1471   2.6374   3.07
6.94  7.3710   16.4568     0.1448   2.6290   3.06
6.95  7.3794   16.7800     0.1424   2.6206   3.05
6.96  7.3877   17.1113     0.1401   2.6123   3.04
6.97  7.3961   17.4509     0.1378   2.6039   3.03
6.98  7.4045   17.7989     0.1356   2.5955   3.02
6.99  7.4129   18.1558     0.1333   2.5871   3.01
7.00  7.4214   18.5216     0.1311   2.5786   3.00



Columns: expected probit Y; maximum working probit Y + Q/Z; range 1/Z; weighting coefficient Z²/PQ; minimum working probit Y - P/Z; expected probit Y.

Y     Y + Q/Z  1/Z         Z²/PQ    Y - P/Z  Y
7.00  7.4214   18.5216     0.1311   2.5786   3.00
7.01  7.4298   18.8967     0.1289   2.5702   2.99
7.02  7.4382   19.2814     0.1268   2.5618   2.98
7.03  7.4467   19.6758     0.1246   2.5533   2.97
7.04  7.4552   20.0803     0.1225   2.5448   2.96
7.05  7.4636   20.4952     0.1204   2.5364   2.95
7.06  7.4721   20.9207     0.1183   2.5279   2.94
7.07  7.4806   21.3572     0.1163   2.5194   2.93
7.08  7.4891   21.8050     0.1142   2.5109   2.92
7.09  7.4976   22.2644     0.1122   2.5024   2.91
7.10  7.5062   22.7357     0.1103   2.4938   2.90
7.11  7.5147   23.2194     0.1083   2.4853   2.89
7.12  7.5232   23.7157     0.1064   2.4768   2.88
7.13  7.5318   24.2251     0.1045   2.4682   2.87
7.14  7.5404   24.7478     0.1026   2.4596   2.86
7.15  7.5489   25.2844     0.1007   2.4511   2.85
7.16  7.5575   25.8352     0.0989   2.4425   2.84
7.17  7.5661   26.4006     0.0971   2.4339   2.83
7.18  7.5747   26.9812     0.0953   2.4253   2.82
7.19  7.5833   27.5772     0.0935   2.4167   2.81
7.20  7.5919   28.1892     0.0918   2.4081   2.80
7.21  7.6006   28.8177     0.0901   2.3994   2.79
7.22  7.6092   29.4631     0.0884   2.3908   2.78
7.23  7.6178   30.1260     0.0867   2.3822   2.77
7.24  7.6265   30.8069     0.0851   2.3735   2.76
7.25  7.6351   31.5063     0.0834   2.3649   2.75
7.26  7.6438   32.2249     0.0818   2.3562   2.74
7.27  7.6525   32.9631     0.0802   2.3475   2.73
7.28  7.6612   33.7216     0.0787   2.3388   2.72
7.29  7.6699   34.5010     0.0771   2.3301   2.71
7.30  7.6786   35.3020     0.0756   2.3214   2.70
7.31  7.6873   36.1251     0.0741   2.3127   2.69
7.32  7.6960   36.9712     0.0727   2.3040   2.68
7.33  7.7047   37.8408     0.0712   2.2953   2.67
7.34  7.7135   38.7348     0.0698   2.2865   2.66
7.35  7.7222   39.6539     0.0684   2.2778   2.65
7.36  7.7310   40.5988     0.0670   2.2690   2.64
7.37  7.7397   41.5704     0.0656   2.2603   2.63
7.38  7.7485   42.5695     0.0643   2.2515   2.62
7.39  7.7573   43.5970     0.0630   2.2427   2.61
7.40  7.7661   44.6538     0.0617   2.2339   2.60
7.41  7.7748   45.7407     0.0604   2.2252   2.59
7.42  7.7836   46.8588     0.0591   2.2164   2.58
7.43  7.7924   48.0090     0.0579   2.2076   2.57
7.44  7.8013   49.1924     0.0567   2.1987   2.56
7.45  7.8101   50.4099     0.0555   2.1899   2.55
7.46  7.8189   51.6628     0.0543   2.1811   2.54
7.47  7.8277   52.9521     0.0532   2.1723   2.53
7.48  7.8366   54.2791     0.0520   2.1634   2.52
7.49  7.8454   55.6448     0.0509   2.1546   2.51
7.50  7.8543   57.0506     0.0498   2.1457   2.50



Table of working probits

Columns: expected probit Y; maximum working probit Y + Q/Z; range 1/Z; weighting coefficient Z²/PQ; minimum working probit Y - P/Z; expected probit Y.

Y     Y + Q/Z  1/Z         Z²/PQ    Y - P/Z  Y
7.50  7.8543   57.0506     0.0498   2.1457   2.50
7.51  7.8631   58.4978     0.0487   2.1369   2.49
7.52  7.8720   59.9876     0.0476   2.1280   2.48
7.53  7.8809   61.5216     0.0466   2.1191   2.47
7.54  7.8897   63.1011     0.0456   2.1103   2.46
7.55  7.8986   64.7277     0.0446   2.1014   2.45
7.56  7.9075   66.4028     0.0436   2.0925   2.44
7.57  7.9164   68.1280     0.0426   2.0836   2.43
7.58  7.9253   69.9051     0.0416   2.0747   2.42
7.59  7.9342   71.7357     0.0407   2.0658   2.41
7.60  7.9432   73.6216     0.0398   2.0568   2.40
7.61  7.9521   75.5646     0.0389   2.0479   2.39
7.62  7.9610   77.5667     0.0380   2.0390   2.38
7.63  7.9700   79.6298     0.0371   2.0300   2.37
7.64  7.9789   81.7559     0.0362   2.0211   2.36
7.65  7.9879   83.9472     0.0354   2.0121   2.35
7.66  7.9968   86.2059     0.0346   2.0032   2.34
7.67  8.0058   88.5342     0.0338   1.9942   2.33
7.68  8.0147   90.9344     0.0330   1.9853   2.32
7.69  8.0237   93.4091     0.0322   1.9763   2.31
7.70  8.0327   95.9607     0.0314   1.9673   2.30
7.71  8.0417   98.5918     0.0307   1.9583   2.29
7.72  8.0507   101.3053    0.0300   1.9493   2.28
7.73  8.0597   104.1038    0.0292   1.9403   2.27
7.74  8.0687   106.9903    0.0285   1.9313   2.26
7.75  8.0777   109.9679    0.0278   1.9223   2.25
7.76  8.0867   113.0396    0.0272   1.9133   2.24
7.77  8.0957   116.2088    0.0265   1.9043   2.23
7.78  8.1047   119.4788    0.0258   1.8953   2.22
7.79  8.1138   122.8530    0.0252   1.8862   2.21
7.80  8.1228   126.3352    0.0246   1.8772   2.20
7.81  8.1318   129.9290    0.0240   1.8682   2.19
7.82  8.1409   133.6385    0.0234   1.8591   2.18
7.83  8.1499   137.4676    0.0228   1.8501   2.17
7.84  8.1590   141.4206    0.0222   1.8410   2.16
7.85  8.1681   145.5018    0.0217   1.8319   2.15
7.86  8.1771   149.7158    0.0211   1.8229   2.14
7.87  8.1862   154.0671    0.0206   1.8138   2.13
7.88  8.1953   158.5609    0.0200   1.8047   2.12
7.89  8.2044   163.2020    0.0195   1.7956   2.11
7.90  8.2134   167.9957    0.0190   1.7866   2.10
7.91  8.2225   172.9476    0.0185   1.7775   2.09
7.92  8.2316   178.0632    0.0181   1.7684   2.08
7.93  8.2407   183.3485    0.0176   1.7593   2.07
7.94  8.2498   188.8095    0.0171   1.7502   2.06
7.95  8.2590   194.4526    0.0167   1.7410   2.05
7.96  8.2681   200.2844    0.0162   1.7319   2.04
7.97  8.2772   206.3118    0.0158   1.7228   2.03
7.98  8.2863   212.5418    0.0154   1.7137   2.02
7.99  8.2955   218.9818    0.0150   1.7045   2.01
8.00  8.3046   225.6395    0.0146   1.6954   2.00



Columns: expected probit Y; maximum working probit Y + Q/Z; range 1/Z; weighting coefficient Z²/PQ; minimum working probit Y - P/Z; expected probit Y.

Y     Y + Q/Z  1/Z         Z²/PQ    Y - P/Z  Y
8.00  8.3046   225.6395    0.0146   1.6954   2.00
8.01  8.3137   232.5229    0.0142   1.6863   1.99
8.02  8.3229   239.6402    0.0138   1.6771   1.98
8.03  8.3320   247.0000    0.0134   1.6680   1.97
8.04  8.3412   254.6114    0.0131   1.6588   1.96
8.05  8.3503   262.4836    0.0127   1.6497   1.95
8.06  8.3595   270.6262    0.0124   1.6405   1.94
8.07  8.3687   279.0493    0.0120   1.6313   1.93
8.08  8.3778   287.7634    0.0117   1.6222   1.92
8.09  8.3870   296.7792    0.0114   1.6130   1.91
8.10  8.3962   306.1082    0.0110   1.6038   1.90
8.11  8.4054   315.7619    0.0107   1.5946   1.89
8.12  8.4146   325.7527    0.0104   1.5854   1.88
8.13  8.4238   336.0932    0.0101   1.5762   1.87
8.14  8.4330   346.7966    0.0099   1.5670   1.86
8.15  8.4422   357.8767    0.0096   1.5578   1.85
8.16  8.4514   369.3477    0.0093   1.5486   1.84
8.17  8.4606   381.2245    0.0090   1.5394   1.83
8.18  8.4698   393.5226    0.0088   1.5302   1.82
8.19  8.4790   406.2580    0.0085   1.5210   1.81
8.20  8.4882   419.4476    0.0083   1.5118   1.80
8.21  8.4974   433.1086    0.0080   1.5026   1.79
8.22  8.5067   447.2593    0.0078   1.4933   1.78
8.23  8.5159   461.9185    0.0076   1.4841   1.77
8.24  8.5251   477.1059    0.0074   1.4749   1.76
8.25  8.5344   492.8419    0.0071   1.4656   1.75
8.26  8.5436   509.1479    0.0069   1.4564   1.74
8.27  8.5529   526.0459    0.0067   1.4471   1.73
8.28  8.5621   543.5592    0.0065   1.4379   1.72
8.29  8.5714   561.7116    0.0063   1.4286   1.71
8.30  8.5806   580.5283    0.0061   1.4194   1.70
8.31  8.5899   600.0353    0.0060   1.4101   1.69
8.32  8.5992   620.2599    0.0058   1.4008   1.68
8.33  8.6084   641.2302    0.0056   1.3916   1.67
8.34  8.6177   662.9758    0.0054   1.3823   1.66
8.35  8.6270   685.5274    0.0053   1.3730   1.65
8.36  8.6363   708.9171    0.0051   1.3637   1.64
8.37  8.6456   733.1780    0.0050   1.3544   1.63
8.38  8.6548   758.3451    0.0048   1.3452   1.62
8.39  8.6641   784.4545    0.0047   1.3359   1.61
8.40  8.6734   811.5439    0.0045   1.3266   1.60
8.41  8.6827   839.6528    0.0044   1.3173   1.59
8.42  8.6920   868.8222    0.0042   1.3080   1.58
8.43  8.7013   899.0948    0.0041   1.2987   1.57
8.44  8.7106   930.5153    0.0040   1.2894   1.56
8.45  8.7200   963.1301    0.0038   1.2800   1.55
8.46  8.7293   996.9878    0.0037   1.2707   1.54
8.47  8.7386   1032.1389   0.0036   1.2614   1.53
8.48  8.7479   1068.6362   0.0035   1.2521   1.52
8.49  8.7572   1106.5347   0.0034   1.2428   1.51
8.50  8.7666   1145.8919   0.0033   1.2334   1.50



Table of working probits

Columns: expected probit Y; maximum working probit Y + Q/Z; range 1/Z; weighting coefficient Z²/PQ; minimum working probit Y - P/Z; expected probit Y.

Y     Y + Q/Z  1/Z         Z²/PQ    Y - P/Z  Y
8.50  8.7666   1145.8919   0.0033   1.2334   1.50
8.51  8.7759   1186.7675   0.0032   1.2241   1.49
8.52  8.7852   1229.2242   0.0031   1.2148   1.48
8.53  8.7946   1273.3271   0.0030   1.2054   1.47
8.54  8.8039   1319.1443   0.0029   1.1961   1.46
8.55  8.8133   1366.7467   0.0028   1.1867   1.45
8.56  8.8226   1416.2085   0.0027   1.1774   1.44
8.57  8.8320   1467.6071   0.0026   1.1680   1.43
8.58  8.8413   1521.0232   0.0025   1.1587   1.42
8.59  8.8507   1576.5411   0.0024   1.1493   1.41
8.60  8.8600   1634.2488   0.0024   1.1400   1.40
8.61  8.8694   1694.2383   0.0023   1.1306   1.39
8.62  8.8788   1756.6055   0.0022   1.1212   1.38
8.63  8.8881   1821.4507   0.0021   1.1119   1.37
8.64  8.8975   1888.8785   0.0021   1.1025   1.36
8.65  8.9069   1958.9983   0.0020   1.0931   1.35
8.66  8.9162   2031.9243   0.0019   1.0838   1.34
8.67  8.9256   2107.7758   0.0019   1.0744   1.33
8.68  8.9350   2186.6775   0.0018   1.0650   1.32
8.69  8.9444   2268.7596   0.0017   1.0556   1.31
8.70  8.9538   2354.1583   0.0017   1.0462   1.30
8.71  8.9632   2443.0158   0.0016   1.0368   1.29
8.72  8.9726   2535.4807   0.0016   1.0274   1.28
8.73  8.9820   2631.7085   0.0015   1.0180   1.27
8.74  8.9914   2731.8615   0.0015   1.0086   1.26
8.75  9.0008   2836.1096   0.0014   0.9992   1.25
8.76  9.0102   2944.6302   0.0014   0.9898   1.24
8.77  9.0196   3057.6091   0.0013   0.9804   1.23
8.78  9.0290   3175.2401   0.0013   0.9710   1.22
8.79  9.0384   3297.7264   0.0012   0.9616   1.21
8.80  9.0478   3425.2801   0.0012   0.9522   1.20
8.81  9.0572   3558.1233   0.0011   0.9428   1.19
8.82  9.0667   3696.4883   0.0011   0.9333   1.18
8.83  9.0761   3840.6179   0.0011   0.9239   1.17
8.84  9.0855   3990.7662   0.0010   0.9145   1.16
8.85  9.0949   4147.1994   0.0010   0.9051   1.15
8.86  9.1044   4310.1955   0.0010   0.8956   1.14
8.87  9.1138   4480.0457   0.0009   0.8862   1.13
8.88  9.1232   4657.0549   0.0009   0.8768   1.12
8.89  9.1327   4841.5419   0.0009   0.8673   1.11
8.90  9.1421   5033.8407   0.0008   0.8579   1.10
8.91  9.1516   5234.3007   0.0008   0.8484   1.09
8.92  9.1610   5443.2878   0.0008   0.8390   1.08
8.93  9.1704   5661.1851   0.0007   0.8296   1.07
8.94  9.1799   5888.3938   0.0007   0.8201   1.06
8.95  9.1894   6125.3338   0.0007   0.8106   1.05
8.96  9.1988   6372.4452   0.0007   0.8012   1.04
8.97  9.2083   6630.1886   0.0006   0.7917   1.03
8.98  9.2177   6899.0468   0.0006   0.7823   1.02
8.99  9.2272   7179.5252   0.0006   0.7728   1.01
9.00  9.2367   7472.1536   0.0006   0.7633   1.00




The proportion killed is $p = 119/281 = 0.4235$.

For $Y = 4.61$, the Table shows the minimum working probit and range as

$Y_{\min} = 3.6680$,

and $1/Z = 2.7047$.

Hence the working probit is

$y = 3.6680 + 2.7047p = 4.8134$.

Alternatively, if survivors instead of deaths have been recorded, calculation may proceed from

$Y_{\max} = 6.3727$,

giving

$y = 6.3727 - 2.7047 \times 0.5765 = 4.8134$.

The Table shows $w = 0.6023$, so that the weight for the observation is

$nw = 281 \times 0.6023 = 169.2$.
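The arithmetic of this example can be checked without the Table. The sketch below is illustrative only: it assumes the usual probit convention (probit = normal deviate + 5) and recomputes the tabulated quantities from the normal integral; the function names are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function (the P of the Table)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_pdf(x: float) -> float:
    """Standard normal ordinate (the Z of the Table)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def table_entries(expected_probit: float) -> dict:
    """Recompute one row of the Table for a given expected probit Y."""
    deviate = expected_probit - 5.0  # assumed convention: probit = deviate + 5
    p_big = normal_cdf(deviate)
    q_big = 1.0 - p_big
    z = normal_pdf(deviate)
    return {
        "minimum": expected_probit - p_big / z,  # Y - P/Z
        "maximum": expected_probit + q_big / z,  # Y + Q/Z
        "range": 1.0 / z,                        # 1/Z
        "weight": z * z / (p_big * q_big),       # Z^2/PQ
    }


def working_probit(expected_probit: float, proportion_killed: float) -> float:
    """Working probit y = (minimum working probit) + p * (range)."""
    row = table_entries(expected_probit)
    return row["minimum"] + proportion_killed * row["range"]


row = table_entries(4.61)
y = working_probit(4.61, 119 / 281)
print(round(row["minimum"], 4), round(row["range"], 4), round(y, 4))
print(round(281 * row["weight"], 1))
```

Working from the maximum working probit and the proportion surviving gives the same value, since the maximum exceeds the minimum by exactly the range.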


REFERENCES 

Bliss, C. I. (1935a). The calculation of the dosage-mortality curve. Ann. Appl. Biol. 22, 134-67.
Bliss, C. I. (1935b). The comparison of dosage-mortality data. Ann. Appl. Biol. 22, 307-33.
Bliss, C. I. (1938). The determination of dosage-mortality curves from small numbers. Quart. J. Pharm. 11, 192-216.
Finney, D. J. (1947a). Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve. London: Cambridge University Press.
Finney, D. J. (1947b). The principles of biological assay. J. Roy. Statist. Soc. Suppl. 9, 46-91.
Fisher, R. A. (1935). Appendix to Bliss, C. I.: The case of zero survivors. Ann. Appl. Biol. 22, 164-5.
Fisher, R. A. & Yates, F. (1943). Statistical Tables for Biological, Agricultural and Medical Research (2nd ed.). Edinburgh: Oliver and Boyd.
Gaddum, J. H. (1933). Reports on biological standards. III. Methods of biological assay depending on a quantal response. Spec. Rep. Ser. Med. Res. Coun., Lond., no. 183.



[ 202 ] 


MISCELLANEA 

A note on the $\chi^2$ smooth test

By H. L. SEAL 


The now-classical test of goodness of fit of a theoretical frequency distribution to a set of observations consists of the calculation of $\sum_{j=1}^{n} (\delta m_j)^2/m_j$ (where $m_j$ is the expectation of the $j$th group of observations and $\delta m_j$ the deviation of the actual number of observations in this group from its expectation) and reference to tables of $\chi^2$ with $n-1$ degrees of freedom. This is under the assumption that the only restraint on the theoretical frequency curve is that $\sum_{j=1}^{n} \delta m_j = 0$. In her recent contribution to this Journal, F. N. David (1947) shows by a geometrical method that, under these circumstances, a test of the significance of sequences of like signs in $\delta m_j$ ($j = 1, 2, \ldots, n$) is, for practical purposes, stochastically independent of the $\chi^2$-test. The restriction of the arguments to cover only the case of a single linear restraint is, however, unnecessary.

It is possible, in this connexion, to state a useful general theorem: If $x_j$ ($j = 1, 2, \ldots, n$) are $n$ random variables normally distributed about zero mean with unit variance, these variables being connected by means of $k$ linear relations, the probability distribution of $q^2 = \sum_{j=1}^{n} x_j^2$ remains unaltered if we select only those samples in which the signs of $x_j$ follow a specified pattern.

The proof is immediate since

$$p\{x_1, x_2, \ldots, x_n\} = \frac{q^{\,n-k-1}\, e^{-\frac{1}{2}q^2}}{2^{\frac{1}{2}(n-k)-1}\,\Gamma[\tfrac{1}{2}(n-k)]}\; F(u_1, u_2, \ldots, u_{n-1}),$$

where $x_j = q u_j$ ($j = 1, 2, \ldots, n$), and the $u_j$ ($j = 1, 2, \ldots, n-1$) are connected by $k$ further relations (cp. Hald & Rasch, 1943). Thus, changes in the signs of the $x$'s are reflected by changes only in the signs of the $u$'s, the joint distribution of which is independent of that of $q$.
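A rough numerical illustration of the theorem may help. The sketch below is only a sanity check under stated assumptions, not part of the proof: it takes the simplest case $k = 0$ (independent unit normal variables), selects the samples in which every $x_j$ is positive, and compares the mean of $q^2$ in the selected samples with the mean over all samples. The function name, sample size and seed are arbitrary choices made for the illustration.

```python
import random


def compare_q2_means(n=3, draws=120_000, seed=1947):
    """Mean of q^2 over all samples and over the all-positive sign pattern."""
    rng = random.Random(seed)
    total_all = 0.0
    total_selected = 0.0
    count_selected = 0
    for _ in range(draws):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        q2 = sum(v * v for v in x)
        total_all += q2
        if all(v > 0.0 for v in x):  # the specified sign pattern: + + +
            total_selected += q2
            count_selected += 1
    return total_all / draws, total_selected / count_selected, count_selected


mean_all, mean_selected, kept = compare_q2_means()
print(mean_all, mean_selected, kept)
```

Both means should lie close to $n$, the mean of $\chi^2$ with $n$ degrees of freedom; agreement of the means is of course weaker than agreement of the whole distribution, which is what the theorem asserts.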

Naturally this theorem does not apply directly to the case considered by David since the $\chi^2$-test is there only an approximation to a set of terms of a multinomial expansion. It is, moreover, easily seen that if the squares of $n$ independent random variables, with the same arbitrary skew distribution law, are added, the probability distribution of the resulting sum will necessarily depend on the signs of these variables. However, the practical use of the $\chi^2$-test as an approximation implies 'near independence' of the test for sequences of signs even when $k$ parameters (instead of one) have been fitted and have reduced the degrees of freedom to $n-k$.

A further point may be made. David's method of combining the $\chi^2$-test of goodness of fit with the sequence test envisages a frequency distribution $m_j$ ($j = 1, 2, \ldots, n$) where $n$ is relatively small: in fact her tables for the application of the combined test are calculated for $n = 5(1)14$. There is, however, a closely analogous case where $n$ may assume a value between about 30 and 75, namely in testing the efficacy of the graduation of a mortality table. The writer (1943) suggested that in this case a good test would consist of the application of a $\chi^2$-test of goodness of fit with $n-k$ degrees of freedom and a sequence test in the form of a $2 \times 2$ contingency table and associated $\chi^2$ value, as indicated by Stevens (1939) himself when he suggested this latter test. In answer to an inquiry, the writer said he was 'not sure of the complete independence of the tests' but in view of the preceding it may be said that the second $\chi^2$ (one degree of freedom) is very nearly independent of the $\chi^2$ value in the main test, so that the two can be conveniently combined by addition into one value with $n-k+1$ degrees of freedom and a useful probability judgement of 'smooth fit' obtained.


REFERENCES 

David, F. N. (1947). A $\chi^2$ 'smooth' test for goodness of fit. Biometrika, 34, 299.
Hald, A. & Rasch, G. (1943). Nogle Anvendelser af Transformationsmetoden i den normale Fordelings Teori. Festskrift til Prof. J. F. Steffensen. Copenhagen.
Seal, H. L. (1943). Tests of a mortality table graduation. J. Inst. Actu. 71, 5.
Stevens, W. L. (1939). Distribution of groups in a sequence of alternatives. Ann. Eugen., Lond., 9, 10.





Rank correlation and product-moment correlation 


By P. A. P. MORAN, Institute of Statistics, Oxford University 


The sampling distribution of Spearman's coefficient of rank correlation, $\rho_s$, has been thoroughly studied in the case where every permutation of the ranks of one variate relative to another is equiprobable and more recently Höffding (1948) has shown that it tends to normality for large samples whatever the parent population. This note deals with the distribution, when the parent is normal, for any size of sample.

Let $(x_1, y_1), \ldots, (x_n, y_n)$ be a sample of $n$ pairs of values from a bivariate normal population with correlation coefficient $\rho$. To obtain $\rho_s$ we replace $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ by their respective ranks and calculate the product moment correlation coefficient of the ranks as if they were variate values. We have then to discuss the distribution of $\rho_s$ when $\rho$ is known.

Consider first the expected value of $\rho_s$. Let $p(x_i)$, $p(y_i)$ be the ranks of $x_i$ and $y_i$. Write

$$H(t) = 0 \ \text{for}\ t \le 0, \qquad H(t) = 1 \ \text{for}\ t > 0.$$

Then

$$p(x_i) - 1 = \sum_{j=1}^{n} H(x_i - x_j). \tag{1}$$

$\rho_s$ will be the correlation coefficient of the numbers $\{p(x_i) - 1\}$ and $\{p(y_i) - 1\}$. If we had defined $H(t)$ in such a way that $H(0) = 1$ we would have obtained $p(x_i)$ on the left-hand side of equation (1) but this greatly complicates later calculations. Now write

$$s = \sum_{i=1}^{n} (p(x_i) - 1)(p(y_i) - 1).$$

It is easy to verify that

$$\sum_{i=1}^{n} \{p(x_i) - 1\} = \tfrac{1}{2} n(n-1),$$

and

$$\sum_{i=1}^{n} \{p(x_i) - \bar p\}^2 = \tfrac{1}{12} n(n^2 - 1),$$

where $\bar p$ is the mean of the ranks. It follows that

$$\rho_s = \frac{s - \tfrac{1}{4} n(n-1)^2}{\tfrac{1}{12} n(n^2-1)},$$

and to find $E(\rho_s)$ it is enough to find $E(s)$. Now

$$s = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} H(x_i - x_j)\, H(y_i - y_k).$$

The terms in this expansion for which $i = j$ or $i = k$ are zero. There remain only two cases to consider.

Case I. $i$, $j$, $k$ all distinct. Then $x_i - x_j$, $y_i - y_k$ are distributed in a bivariate normal distribution with correlation coefficient equal to $\tfrac{1}{2}\rho$. $E\{H(x_i - x_j) H(y_i - y_k)\}$ will therefore be the chance that both these quantities are positive, that is, the integral of the probability density over the positive quadrant. This is known (Sheppard, 1898) to be equal to $\tfrac{1}{2}\{1 - \pi^{-1}\cos^{-1}\tfrac{1}{2}\rho\}$. There are clearly $n(n-1)(n-2)$ such terms.

Case II. $j = k$. Then $x_i - x_j$ and $y_i - y_k$ have a correlation coefficient $\rho$ and the expectation of each term is $\tfrac{1}{2}\{1 - \pi^{-1}\cos^{-1}\rho\}$. The number of such terms is $n(n-1)$.

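It follows that

$$E(s) = \tfrac{1}{2} n(n-1)(n-2)(1 - \pi^{-1}\cos^{-1}\tfrac{1}{2}\rho) + \tfrac{1}{2} n(n-1)(1 - \pi^{-1}\cos^{-1}\rho),$$

and so

$$E(\rho_s) = \frac{12}{n(n^2-1)}\left\{\tfrac{1}{2} n(n-1)(n-2)(1 - \pi^{-1}\cos^{-1}\tfrac{1}{2}\rho) + \tfrac{1}{2} n(n-1)(1 - \pi^{-1}\cos^{-1}\rho) - \tfrac{1}{4} n(n-1)^2\right\}$$

$$= \frac{6}{\pi}\left\{\frac{n-2}{n+1}\sin^{-1}\tfrac{1}{2}\rho + \frac{1}{n+1}\sin^{-1}\rho\right\}. \tag{2}$$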


For $\rho = 0, 1$ this equals zero or unity as we would expect. Equation (2) may be compared with K. Pearson's approximate formula (1907) for turning rank correlation coefficients into product moment correlation coefficients, which he derived by using the correlation of grades. This is

$$\rho = 2\sin\left(\tfrac{1}{6}\pi\rho_s\right).$$

Equation (2) tends to this as $n$ increases, showing that $\rho_s$ is not only a biased estimator of $\rho$, but also an inconsistent one. This, of course, does not prevent the use of $\rho_s$ in estimating $\rho$.

The table which follows is a table of $E(\rho_s)$ as given by (2) for $n = 5$, 10, 20 and $\infty$, $\rho = 0(0.1)0.9$. The lower part of the table, for comparison, shows the expected values of $r$, the sample product-moment correlation coefficient, for the same values. These were taken from the tables in Soper, etc. (1917). Interpolation for $E(\rho_s)$ is linear in $(n+1)^{-1}$ and nearly linear in $n^{-1}$.


                 ρ = 0.1    0.2      0.3      0.4      0.5      0.6      0.7      0.8      0.9
E(ρ_s)  n = 5      ..     0.1597     ..       ..       ..       ..       ..     0.6881     ..
        n = 10     ..     0.1741   0.2620     ..       ..     0.5349     ..       ..       ..
        n = 20     ..     0.1823     ..       ..       ..       ..       ..       ..       ..
        n = ∞      ..     0.1913     ..       ..       ..     0.5819     ..       ..       ..
E(r)    n = 5    0.0884   0.1773   0.2671   0.3584   0.4517   0.5480     ..       ..     0.8687
        n = 10   0.0946   0.1896   0.2850   0.3813   0.4787   0.5776     ..       ..       ..
        n = 20   0.0974   0.1950   0.2928   0.3911   0.4900   0.5896     ..     0.7919     ..
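Equation (2) is easily evaluated numerically. The following sketch is offered only as an illustration (the function name is hypothetical); it evaluates (2) for finite $n$ and uses the limiting form $(6/\pi)\sin^{-1}\tfrac{1}{2}\rho$ when $n$ is infinite.

```python
import math


def expected_spearman(rho: float, n: float) -> float:
    """E(rho_s) from equation (2); n may be math.inf for the limiting case."""
    if math.isinf(n):
        return 6.0 / math.pi * math.asin(0.5 * rho)
    return 6.0 / math.pi * (
        (n - 2.0) / (n + 1.0) * math.asin(0.5 * rho)
        + math.asin(rho) / (n + 1.0)
    )


for n in (5, 10, 20, math.inf):
    print(n, round(expected_spearman(0.2, n), 4))
```

For a fixed $\rho$ the value is a linear function of $(n+1)^{-1}$, which is why interpolation in that quantity is exact.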



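Before considering the calculation of var($\rho_s$), which we can find from var($s$) and so from $E(s^2)$, we must consider the problem of evaluating the total probability in the positive part of a quadrivariate normal distribution.

Suppose that $x_1, x_2, x_3, x_4$ are distributed in a quadrivariate normal distribution with correlation matrix

$$\begin{pmatrix} 1 & \rho_{12} & \rho_{13} & \rho_{14} \\ \rho_{12} & 1 & \rho_{23} & \rho_{24} \\ \rho_{13} & \rho_{23} & 1 & \rho_{34} \\ \rho_{14} & \rho_{24} & \rho_{34} & 1 \end{pmatrix}.$$

We write

$$T\begin{pmatrix} \rho_{12} & \rho_{13} & \rho_{14} \\ & \rho_{23} & \rho_{24} \\ & & \rho_{34} \end{pmatrix} = \mathrm{pr}\{x_1 > 0,\ x_2 > 0,\ x_3 > 0,\ x_4 > 0\}.$$

This will be independent of the variances of the $x$'s which we suppose all equal to unity. Then the characteristic function of the distribution is

$$\phi(t_1, \ldots, t_4) = \exp\{-\tfrac{1}{2}t_1^2 - \tfrac{1}{2}t_2^2 - \tfrac{1}{2}t_3^2 - \tfrac{1}{2}t_4^2 - \rho_{12}t_1t_2 - \rho_{13}t_1t_3 - \rho_{14}t_1t_4 - \rho_{23}t_2t_3 - \rho_{24}t_2t_4 - \rho_{34}t_3t_4\},$$

and we then have

$$T = \frac{1}{16\pi^4}\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^\infty dx_1\,dx_2\,dx_3\,dx_4 \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-i(t_1x_1+t_2x_2+t_3x_3+t_4x_4)}\,\phi(t_1, \ldots, t_4)\,dt_1\,dt_2\,dt_3\,dt_4.$$

Now

$$\phi(t_1, \ldots, t_4) = \exp\Big(-\tfrac{1}{2}\sum_{j=1}^{4} t_j^2\Big) \sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\sum_{p=0}^{\infty}\sum_{q=0}^{\infty}\sum_{r=0}^{\infty} \frac{(-1)^{l+m+n+p+q+r}\,\rho_{12}^{l}\rho_{13}^{m}\rho_{14}^{n}\rho_{23}^{p}\rho_{24}^{q}\rho_{34}^{r}}{l!\,m!\,n!\,p!\,q!\,r!}\; t_1^{\,l+m+n}\, t_2^{\,l+p+q}\, t_3^{\,m+p+r}\, t_4^{\,n+q+r},$$

and we can therefore write

$$T = \sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\sum_{p=0}^{\infty}\sum_{q=0}^{\infty}\sum_{r=0}^{\infty} A_{lmnpqr}\,\rho_{12}^{l}\rho_{13}^{m}\rho_{14}^{n}\rho_{23}^{p}\rho_{24}^{q}\rho_{34}^{r},$$

where

$$A_{lmnpqr} = \frac{(-1)^{l+m+n+p+q+r}}{l!\,m!\,n!\,p!\,q!\,r!}\; G_{l+m+n}\, G_{l+p+q}\, G_{m+p+r}\, G_{n+q+r}$$

and

$$G_a = \frac{1}{2\pi}\int_0^\infty dx \int_{-\infty}^{\infty} t^a e^{-itx - \frac{1}{2}t^2}\,dt.$$

Now

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} t^a e^{-itx - \frac{1}{2}t^2}\,dt = i^a \left(\frac{d}{dx}\right)^{a} (2\pi)^{-\frac{1}{2}} e^{-\frac{1}{2}x^2}.$$

From this it follows that for $a = 0$, $G_a = \tfrac{1}{2}$. Consider $a > 0$. Then

$$G_a = -\,i^a \left[\left(\frac{d}{dx}\right)^{a-1} (2\pi)^{-\frac{1}{2}} e^{-\frac{1}{2}x^2}\right]_{x=0}$$

$= 0$ when $a$ is even,

$= -\,i\,\dfrac{(2m)!}{2^m\, m!\,(2\pi)^{\frac{1}{2}}}$ when $a$ is odd and equal to $2m+1$, where $m = 0, 1, 2, \ldots$.

Our final result is therefore

$$T\begin{pmatrix} \rho_{12} & \rho_{13} & \rho_{14} \\ & \rho_{23} & \rho_{24} \\ & & \rho_{34} \end{pmatrix} = \sum_{l=0}^{\infty}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}\sum_{p=0}^{\infty}\sum_{q=0}^{\infty}\sum_{r=0}^{\infty} \frac{(-1)^{l+m+n+p+q+r}\, G_{l+m+n}\, G_{l+p+q}\, G_{m+p+r}\, G_{n+q+r}}{l!\,m!\,n!\,p!\,q!\,r!}\; \rho_{12}^{l}\rho_{13}^{m}\rho_{14}^{n}\rho_{23}^{p}\rho_{24}^{q}\rho_{34}^{r} \tag{3}$$

$$= \frac{1}{16} + \frac{1}{8\pi}(\rho_{12} + \rho_{13} + \rho_{14} + \rho_{23} + \rho_{24} + \rho_{34}) + \frac{1}{4\pi^2}(\rho_{12}\rho_{34} + \rho_{13}\rho_{24} + \rho_{14}\rho_{23}) + \ldots,$$

but not all the terms are positive. For example, the coefficient of $\rho_{12}\rho_{13}\rho_{14}$ ($l = m = n = 1$, $p = q = r = 0$) is

$$-\frac{1}{4\pi^2}.$$

Expressions similar to the above have also been given by M. G. Kendall (1941, 1945) and have also been given in the lecture courses of Prof. A. C. Aitken as is stated in Kendall (1941). Kendall's version of the formula for four variables needs to be interpreted in the light of an earlier comment that the leading term is conventionally determined and in any case omits a power of $-1$.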
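The coefficients of the series for the positive-orthant probability can be generated mechanically. The sketch below is an illustration under a stated assumption: it uses the equivalent real form in which each factor is $c_a = \int_0^\infty He_a(x)\,(2\pi)^{-1/2}e^{-x^2/2}\,dx$, with $He_a$ the Hermite polynomial, so that $c_0 = \tfrac{1}{2}$, $c_a = 0$ for even $a > 0$ and $c_{2m+1} = (-1)^m (2m)!/\{2^m m!\,(2\pi)^{1/2}\}$; the powers of $-1$ and $i$ then cancel. The function names are hypothetical.

```python
import math


def half_line_factor(a: int) -> float:
    """c_a: integral over (0, inf) of He_a(x) times the unit normal density."""
    if a == 0:
        return 0.5
    if a % 2 == 0:
        return 0.0
    m = (a - 1) // 2
    return (-1) ** m * math.factorial(2 * m) / (
        2 ** m * math.factorial(m) * math.sqrt(2.0 * math.pi)
    )


def series_coefficient(l, m, n, p, q, r) -> float:
    """Coefficient of rho12^l rho13^m rho14^n rho23^p rho24^q rho34^r."""
    powers = (l + m + n, l + p + q, m + p + r, n + q + r)
    numerator = math.prod(half_line_factor(a) for a in powers)
    denominator = math.prod(math.factorial(k) for k in (l, m, n, p, q, r))
    return numerator / denominator


print(series_coefficient(0, 0, 0, 0, 0, 0))  # leading term
print(series_coefficient(1, 1, 1, 0, 0, 0))  # a negative coefficient
```

When only one correlation differs from zero the series can be summed and compared with Sheppard's quadrant formula, which gives a convenient check on the signs.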

We now consider the problem of finding var($\rho_s$). Since

$$\operatorname{var}(\rho_s) = \frac{144}{n^2(n^2-1)^2}\operatorname{var}(s),$$

it will be sufficient to find var($s$) $= E(s^2) - [E(s)]^2$. Now

$$s^2 = \sum_i\sum_j\sum_k\sum_p\sum_q\sum_r H(x_i - x_j)\, H(y_i - y_k)\, H(x_p - x_q)\, H(y_p - y_r). \tag{4}$$

Terms in this expansion for which $i = j$, $i = k$, $p = q$ or $p = r$ are zero. If we write

$$X_1 = x_i - x_j, \quad X_2 = y_i - y_k, \quad X_3 = x_p - x_q, \quad X_4 = y_p - y_r,$$

the quantities $X_1, X_2, X_3, X_4$ will be distributed in a quadrivariate normal distribution and the chance that they are all positive will be

$$T\begin{pmatrix} \tfrac{1}{2}\rho(1 + \delta_{jk}) & \tfrac{1}{2}(\delta_{ip} - \delta_{iq} - \delta_{jp} + \delta_{jq}) & \tfrac{1}{2}\rho(\delta_{ip} - \delta_{ir} - \delta_{jp} + \delta_{jr}) \\ & \tfrac{1}{2}\rho(\delta_{ip} - \delta_{iq} - \delta_{kp} + \delta_{kq}) & \tfrac{1}{2}(\delta_{ip} - \delta_{ir} - \delta_{kp} + \delta_{kr}) \\ & & \tfrac{1}{2}\rho(1 + \delta_{qr}) \end{pmatrix}, \tag{5}$$

where $\delta_{ij}$ is the Kronecker $\delta$ equal to unity if $i = j$ and zero otherwise. Using (3) we can now evaluate

$$E\{H(x_i - x_j)\, H(y_i - y_k)\, H(x_p - x_q)\, H(y_p - y_r)\} = \mathrm{pr}\{X_1 > 0,\ X_2 > 0,\ X_3 > 0,\ X_4 > 0\},$$

and we must calculate (5), using (3), for each type of set of suffixes in the terms of (4). The number of times each such type of term occurs in (4) will be $n(n-1)\ldots(n-a+1)$, where $a$ is the number of distinct suffixes. From a theoretical point of view this solves the problem but in practice, since there are a fairly large number of different terms which arise from the various ways of identifying suffixes in (4), it is probably only barely possible to calculate var($\rho_s$) for any given value of $\rho$. It is certainly not practical as a routine.
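The bookkeeping behind (5) can be sketched in code. The function below is illustrative only (its name and the returned dictionary keys are hypothetical): it returns the six correlations of $X_1, \ldots, X_4$ for a given set of suffixes, obtained by dividing each covariance by the product of the standard deviations, each of which is $\sqrt{2}$.

```python
def delta(a, b) -> float:
    """Kronecker delta."""
    return 1.0 if a == b else 0.0


def difference_correlations(i, j, k, p, q, r, rho: float) -> dict:
    """Correlations of X1 = x_i - x_j, X2 = y_i - y_k, X3 = x_p - x_q, X4 = y_p - y_r."""
    if i in (j, k) or p in (q, r):
        raise ValueError("terms with i = j, i = k, p = q or p = r vanish")
    return {
        "12": 0.5 * rho * (1.0 + delta(j, k)),
        "13": 0.5 * (delta(i, p) - delta(i, q) - delta(j, p) + delta(j, q)),
        "14": 0.5 * rho * (delta(i, p) - delta(i, r) - delta(j, p) + delta(j, r)),
        "23": 0.5 * rho * (delta(i, p) - delta(i, q) - delta(k, p) + delta(k, q)),
        "24": 0.5 * (delta(i, p) - delta(i, r) - delta(k, p) + delta(k, r)),
        "34": 0.5 * rho * (1.0 + delta(q, r)),
    }


# Six distinct suffixes: only the (1,2) and (3,4) correlations survive.
print(difference_correlations(1, 2, 3, 4, 5, 6, rho=0.6))
```

Enumerating the distinct patterns of coincidence among the six suffixes, and weighting each by $n(n-1)\ldots(n-a+1)$, is the laborious part referred to in the text.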





The above investigation may be compared with the similar investigations on the sampling distribution of Kendall's $\tau$ (Greiner, 1909; Esscher, 1924; Kendall, 1948) and the relative simplicity of the latter, especially in the formulae for the variance, brings out once again the superiority of $\tau$ as a measure of rank correlation.

REFERENCES 

Esscher, F. (1924). On a method of determining correlation from the ranks of variables. Skand. AktuarTidskr. 7, 201.
Greiner, E. (1909). Über das Fehlersystem der Kollektivmasslehre. Z. Math. Phys. 57, 225.
Höffding, W. (1948). A class of statistics with asymptotically normal distributions. Ann. Math. Statist. (In the Press.)
Kendall, M. G. (1941). Proof of relations connected with the tetrachoric series and its generalization. Biometrika, 32, 196.
Kendall, M. G. (1943). The Advanced Theory of Statistics, vol. 1 (3rd ed. 1947). Charles Griffin and Co.
Kendall, M. G. (1945). On the analysis of oscillatory time-series. J. R. Statist. Soc. 108, 93.
Kendall, M. G. (1948). Rank Correlation Methods. London: Charles Griffin & Co. (In the Press.)
Pearson, K. (1907). Mathematical contributions to the theory of evolution. XVI. On further methods of determining correlation. Drap. Co. Mem. biom. Ser. Cambridge University Press.
Sheppard, W. F. (1898). On the application of the theory of error to cases of normal distributions and normal correlations. Philos. Trans. A, 192, 101.
Soper, H. E. and others (1917). On the distribution of the correlation coefficient in small samples. Biometrika, 11, 328.


Tests of significance in the variate difference method 

By N. L. JOHNSON 


1. The variate difference method has been discussed by many writers, and in particular by Yule (1921), Anderson (1923, 1926, 1927) and Tintner (1940). It is a method of isolating the random part of certain types of time series and is briefly described in § 2 below.


2. Let $u_1, \ldots, u_n$ be successive observations at equal intervals of time, forming a time series. Suppose that it is reasonable to assume that

$$u_t = f_t + z_t, \tag{1}$$

where

(i) $f_t$ is a 'smooth trend' such that for some integer $K$ (and hence for all larger integers) $\Delta^K f_t = 0$;

(ii) $z_t$ is a 'random residual' with expected value zero and standard deviation $\sigma$, independent of $t$;

(iii) $z_1, z_2, \ldots, z_n$ are mutually independent.

Under these conditions, if the series $(u_t)$ be differenced $K$ times, the $f_t$ terms will be eliminated and

$$\Delta^K u_t = \Delta^K z_t. \tag{2}$$

Furthermore $\Delta^{K+1} u_t = \Delta^{K+1} z_t$, $\Delta^{K+2} u_t = \Delta^{K+2} z_t$ and so on.

Since these relations hold, we have, for $k \ge K$,

$$E(\Delta^k u_t) = 0, \tag{3·1}$$

$$E[\{\Delta^k u_t\}^2] = {}^{2k}C_k\,\sigma^2. \tag{3·2}$$

Thus, if $k \ge K$,

$$S_k = [(n-k)\,{}^{2k}C_k]^{-1} \sum_{t=1}^{n-k} \{\Delta^k u_t\}^2 \tag{4}$$

is an unbiased estimate of $\sigma^2$.

If $\Delta f_t = 0$, i.e. if the original series is random,

$$S_0 = (n-1)^{-1} \sum_{t=1}^{n} (u_t - \bar u)^2 \tag{5}$$

is an unbiased estimate of $\sigma^2$.

The sequence of statistics $S_0, S_1, S_2, \ldots$ is computed from the observed values $u_1, u_2, \ldots, u_n$, $S_k$ being defined by (4) for $k \ge 1$ and by (5) for $k = 0$. This sequence may be used as an indicator of the order of difference at which the non-random elements are first eliminated. Generally the original series is not random and so $S_1$ is usually considerably smaller than $S_0$. As $k$ increases the ratio $S_{k+1}/S_k$ should approach unity. The value of $k$ at and after which $S_{k+1}/S_k$ stays sufficiently close to unity is taken as the order of difference necessary to eliminate the non-random terms $f_t$.
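The computation of the sequence $S_0, S_1, S_2, \ldots$ may be sketched as follows. This is an illustrative reading of (4) and (5) only; the function names are hypothetical, and the example series (a pure quadratic trend with no random part) is chosen so that the third differences vanish.

```python
import math


def kth_differences(series, k):
    """Return the k-th forward differences of the series."""
    values = list(series)
    for _ in range(k):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def variate_difference_estimate(series, k):
    """S_k as defined by (4) for k >= 1 and by (5) for k = 0."""
    n = len(series)
    if k == 0:
        mean = sum(series) / n
        return sum((u - mean) ** 2 for u in series) / (n - 1)
    diffs = kth_differences(series, k)  # n - k values
    return sum(d * d for d in diffs) / ((n - k) * math.comb(2 * k, k))


trend_only = [t * t for t in range(1, 11)]  # quadratic trend, so K = 3
for k in range(4):
    print(k, variate_difference_estimate(trend_only, k))
```

With a random residual added to the trend, $S_k$ would settle near $\sigma^2$ from $k = 3$ onwards rather than at zero.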

3. In order to test precisely whether the ratio $S_{k+1}/S_k$ is in fact 'sufficiently close' to unity it is necessary to assume a form of distribution for the $z_t$'s. It is supposed that the assumption of a normal form of distribution for the $z_t$'s will not give rise to serious error. Even with this assumption it is very difficult to obtain exact significance levels for the ratio $S_{k+1}/S_k$, owing to the existence of correlations of various degrees of intensity between the differences involved.

Tintner suggested a method of overcoming this drawback. He proposed that certain sets of the $k$th and $(k+1)$th differences should be selected in such a way that any selected difference of either order should be uncorrelated with any other selected difference of either order. The sets to be selected were of the types

$$(\Delta^k u_r), \quad r = t,\ t+(2k+3),\ t+2(2k+3),\ \ldots,\ t+(j-1)(2k+3),$$

$$(\Delta^{k+1} u_s), \quad s = t+k+1,\ t+k+1+(2k+3),\ \ldots,\ t+k+1+(j-1)(2k+3).$$

$t$ may have any integral value from 1 to $(2k+3)$ inclusive, each value giving rise to a different selection; $j$ has the largest value possible. In any one selection none of the quantities $\Delta^k u_r$, $\Delta^{k+1} u_s$ have a $u_t$ in common, and their independence is thereby assured. If the non-random elements have been removed by taking $k$th differences, then $(\Delta^k u_r)$, $(\Delta^{k+1} u_s)$ should be sequences of independent normal variables, each with expected value zero and with variances in the ratio

$$\frac{\operatorname{var}(\Delta^k u_r)}{\operatorname{var}(\Delta^{k+1} u_s)} = \tfrac{1}{2}(k+1)/(2k+1).$$

Tintner suggests using as test criterion

$$F = \frac{\sum_r \{\Delta^k u_r\}^2}{\sum_s \{\Delta^{k+1} u_s\}^2} \cdot \frac{2(2k+1)}{k+1}, \quad \text{or} \quad z = \tfrac{1}{2}\log_e F. \tag{6}$$

Standard tables of $F$ or $z$, entered with degrees of freedom $j$, $j$ would then provide exact significance limits for the suggested criteria.
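A minimal sketch of Tintner's selection and of the criterion $F$ in (6) follows, assuming 1-based indices as in the text. The function names and the short integer test series are hypothetical and serve only to make the block runnable.

```python
def differences(u, k):
    """Return the k-th forward differences of the sequence u."""
    d = list(u)
    for _ in range(k):
        d = [b - a for a, b in zip(d, d[1:])]
    return d

def tintner_selection(n, k, t=1):
    """1-based indices r (k-th differences) and s ((k+1)-th differences)."""
    step = 2 * k + 3
    r, s = [], []
    j = 0
    # The (k+1)-th difference starting at s uses u_s, ..., u_{s+k+1}.
    while t + k + 1 + j * step + (k + 1) <= n:
        r.append(t + j * step)
        s.append(t + k + 1 + j * step)
        j += 1
    return r, s

def tintner_f(u, k, t=1):
    """Return (F, j): the criterion of equation (6) and its degrees of freedom."""
    r, s = tintner_selection(len(u), k, t)
    dk = differences(u, k)       # dk[i - 1] is the k-th difference starting at u_i
    dk1 = differences(u, k + 1)
    num = sum(dk[i - 1] ** 2 for i in r)
    den = sum(dk1[i - 1] ** 2 for i in s)
    return (num / den) * 2 * (2 * k + 1) / (k + 1), len(r)

# Hypothetical irregular series of 48 values, used only to exercise the code.
u = [((7 * t) % 11) - 5 for t in range(1, 49)]
f_value, j = tintner_f(u, k=1)
```

With 48 observations, $k = 1$ and $t = 1$ this selection keeps 9 differences of each order, so `f_value` would be referred to tables of $F$ with 9 and 9 degrees of freedom.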

4. Although this method of selection gives a test which is exact, provided the assumptions upon which it is based are correct, it involves the sacrifice of a considerable proportion of the data available. Thus only about one out of each $(2k+3)$ members of the two difference columns under comparison is used explicitly in the test criteria. This is necessary if the correlation between any two selected differences is to be zero.

It is possible, however, to make use of somewhat more of the data available, still subjecting the results to a fairly simple exact test, analogous to that proposed by Tintner. Consider, in fact, the modified method of selection leading to the sequence of pairs of differences

$$(\Delta^k u_r,\ \Delta^{k+1} u_r), \quad r = t,\ t+k+2,\ t+2(k+2),\ \ldots,\ t+(j'-1)(k+2). \tag{7}$$

Here $t$ may be any integer from 1 to $(k+2)$ inclusive, and $j'$ has its largest possible value.

If non-random elements have been eliminated in the $k$th differences then

(i) the correlation between corresponding differences $\Delta^k u_r$ and $\Delta^{k+1} u_r$ is $-[\tfrac{1}{2}(2k+1)/(k+1)]^{1/2}$;

(ii) any difference of one order is correlated only with the corresponding difference of the other order;

(iii) the expected values and ratio of variances of the sequences $(\Delta^k u_r)$, $(\Delta^{k+1} u_r)$ are the same as in Tintner's method.

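An exact test based on the selection (7) would utilize about twice as many differences as would Tintner's test for the same time series. For example, we have the following comparison between Tintner's method of selection and method (7) in the special case $t = 1$, $k = 2$.

Tintner's Method

$$\Delta^2 u_1 = u_3 - 2u_2 + u_1, \qquad \Delta^3 u_4 = u_7 - 3u_6 + 3u_5 - u_4,$$

$$\Delta^2 u_8 = u_{10} - 2u_9 + u_8, \qquad \Delta^3 u_{11} = u_{14} - 3u_{13} + 3u_{12} - u_{11},$$

etc. Variances: $6\sigma^2$ (second differences), $20\sigma^2$ (third differences). Correlation: 0.

Present Method (7)

$$\Delta^2 u_1 = u_3 - 2u_2 + u_1, \qquad \Delta^3 u_1 = u_4 - 3u_3 + 3u_2 - u_1,$$

$$\Delta^2 u_5 = u_7 - 2u_6 + u_5, \qquad \Delta^3 u_5 = u_8 - 3u_7 + 3u_6 - u_5,$$

etc. Variances: $6\sigma^2$ (second differences), $20\sigma^2$ (third differences). Correlation: $-\sqrt{5/6}$.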

The gain in efficient use of the data would not, however, be as great as this would suggest, since the correlation is high between the differences already in Tintner's selection and the additional differences introduced in (7). Nevertheless, a certain amount of extra information would be used in a test based on (7). Such an exact test, analogous to Tintner's test, is derived in § 5.
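The statement that selection (7) uses about twice as many differences can be checked by counting the selected differences directly. The sketch below assumes a series of $n$ observations indexed from 1; the two counting functions are illustrative names, not notation from the note.

```python
def count_tintner(n, k, t=1):
    """Number j of differences of each order in Tintner's selection."""
    # The last (k+1)-th difference starts at t + k + 1 + (j - 1)(2k + 3)
    # and must end no later than u_n.
    return (n - t - 2 * k - 2) // (2 * k + 3) + 1

def count_selection7(n, k, t=1):
    """Number j' of pairs of differences in selection (7)."""
    # The last (k+1)-th difference starts at t + (j' - 1)(k + 2)
    # and must end no later than u_n.
    return (n - t - k - 1) // (k + 2) + 1

# Counts for a series of 48 observations with t = 1.
table = {k: (count_tintner(48, k), count_selection7(48, k)) for k in (1, 2, 3)}
```

For 48 observations and $t = 1$ the pairs of counts are (9, 16), (6, 12) and (5, 9) for $k = 1, 2, 3$, in line with the rough factor of two quoted above.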





5. In the case of Tintner's method of selection, the hypothesis which should be tested is that the expected value of each of the differences is zero. The ratio of the variances of the differences in the two sequences is known. The alternative hypotheses specify non-zero values of the expected values, the ratio of the variances remaining unchanged. It is not necessary that these non-zero expected values be constant for either of the sequences of differences involved. Tintner's criterion is, however, appropriate to the case where the alternative hypotheses specify different values for the ratio of the variances, the expected values remaining at zero (i.e. the hypothesis tested is that the variances are in a certain ratio, it being assumed that the expected values are zero). This system of hypotheses seems to be a reasonable approximation to the situation, as may be appreciated by regarding the differences $\Delta^k f_r$, $\Delta^{k+1} f_r$ as random variables.

A similar approach in the case of selection (7) leads to the conclusion that the hypothesis to be tested may be stated as follows: 'The variances of the two sequences of differences are in a certain ratio, and the correlation between corresponding differences has a certain value. It is assumed that all expected values are zero.' The alternative hypotheses specify other values for the ratio of the variances and the correlation, all expected values being supposed to remain at zero.

A test appropriate to this situation may be obtained by a slight modification of a result due to Hsu 
( 1940), who used Neyman and Pearson’s likelihood ratio method to derive a number of tests of hypotheses 
regarding two normally correlated variables. The hypothesis to be tested is, in fact, nearly the same as 
Hsu’s hypothesis H A and the appropriate test criterion is similar to his criterion L v This similarity is not, 
however, apparent when the special symbols and values of the present problem are inserted, giving a test criterion

$$L = \frac{T_{k,k}\,T_{k+1,k+1} - T_{k,k+1}^2}{(2k+1)\left[T_{k,k} + \tfrac{1}{2}(k+1)(2k+1)^{-1}\,T_{k+1,k+1} + T_{k,k+1}\right]^2}, \tag{8}$$

where $T_{g,h} = \sum_r \Delta^g u_r\,\Delta^h u_r$.

$L$ must lie between 0 and 1. Low values of $L$ are regarded as significant of departure from the hypothesis tested, indicating that the 'non-random' terms have not been eliminated in the $k$th differences. If non-random variations are in fact absent from the $k$th differences, the probability density function of $L$ is

$$p(L) = \tfrac{1}{2}(j'-1)\,L^{\frac{1}{2}(j'-3)} \quad (0 \le L \le 1). \tag{9}$$


It will be recalled that $j'$ is the number of differences in each of the sequences $(\Delta^k u_r)$, $(\Delta^{k+1} u_r)$. From (9) it follows that $L_\alpha$, the significance limit corresponding to a probability $\alpha$ of rejecting the hypothesis when it is valid, satisfies the equation

$$L_\alpha^{\frac{1}{2}(j'-1)} = \alpha,$$

i.e.

$$\log_{10} L_\alpha = (2\log_{10}\alpha)/(j'-1). \tag{10}$$

If $\log_{10} L$ be taken as test criterion, the significance limits may be calculated very easily by means of (10). We note that the 5 % limit for $\log_{10} L$ is $-2.6/(j'-1)$, and the 1 % limit is $-4.0/(j'-1)$.
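As a hedged illustration of criterion (8) and the limits in (10), the following Python sketch evaluates $L$ from the three sums of squares and products. The function and argument names are assumptions; the numerical inputs are the sums quoted in § 6 for $k = 1$, $j' = 16$.

```python
from math import log10

def criterion_l(t_kk, t_k_k1, t_k1_k1, k):
    """Criterion L of equation (8) from the sums of squares and products."""
    numerator = t_kk * t_k1_k1 - t_k_k1 ** 2
    bracket = t_kk + 0.5 * (k + 1) / (2 * k + 1) * t_k1_k1 + t_k_k1
    return numerator / ((2 * k + 1) * bracket ** 2)

def log10_limit(alpha, j_prime):
    """Significance limit for log10(L) from equation (10)."""
    return 2 * log10(alpha) / (j_prime - 1)

# Sums quoted in section 6 for first and second differences (k = 1).
L = criterion_l(31.269830, -23.842300, 29.137173, k=1)
log_L = log10(L)
limit_1pc = log10_limit(0.01, 16)
limit_5pc = log10_limit(0.05, 16)
significant_at_1pc = log_L < limit_1pc
```

Run on these inputs, the sketch agrees with the printed $L = 0.3889$ to about three decimal places, and `log_L` falls below the 1 % limit, as the text concludes.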

6. We shall now apply selection (7) to the series of American wheat-flour prices for the years 1890-1937, which is analysed by Tintner. There are forty-eight observations in this series. Taking $t = 1$, and comparing the first and second order differences (i.e. $k = 1$) we have the sequences

$$\Delta u_1,\ \Delta u_4,\ \Delta u_7,\ \ldots,\ \Delta u_{46};$$

$$\Delta^2 u_1,\ \Delta^2 u_4,\ \Delta^2 u_7,\ \ldots,\ \Delta^2 u_{46};$$

so that $j' = 16$. (It may be noted that the sequences of differences in Tintner's selection I-A each contain 9 members.) The sums of squares and products are

$$T_{1,1} = 31.269830, \quad T_{1,2} = -23.842300, \quad T_{2,2} = 29.137173,$$

whence $L = 0.3889$,

and $\log_{10} L = -0.4102$.

For $j' = 16$, the 1 % limit for $\log_{10} L$ is $-0.27$ and the 5 % limit is $-0.17$. The calculated value of $\log_{10} L$ is less than the 1 % limit, and it seems unlikely that the non-random terms have been eliminated in the first differences.

Now taking $k = 2$ (and $t = 1$ as before) we have the sequences of differences

$$\Delta^2 u_1,\ \Delta^2 u_5,\ \Delta^2 u_9,\ \ldots,\ \Delta^2 u_{45}; \qquad \Delta^3 u_1,\ \Delta^3 u_5,\ \Delta^3 u_9,\ \ldots,\ \Delta^3 u_{45};$$

so that in this case $j' = 12$. The sums of squares and products are

$$T_{2,2} = 12.030127; \quad T_{2,3} = -10.330714; \quad T_{3,3} = 53.809942,$$

whence $L = 0.3396$,

and $\log_{10} L = -0.4690$.

For $j' = 12$, the 1 % limit for $\log_{10} L$ is $-0.36$ and the 5 % limit is $-0.24$. Again it seems unlikely that non-random terms have been eliminated.

Comparing third and fourth order differences ($k = 3$) we have, using the selection corresponding to $t = 1$,

$$T_{3,3} = 102.200898, \quad T_{3,4} = -188.015311, \quad T_{4,4} = 358.329689,$$

giving $L = 0.5862$ and $\log_{10} L = -0.2320$. The appropriate 1 % limit ($j' = 9$) is $-0.50$ and the 5 % limit is $-0.33$. This test supports the hypothesis that non-random terms have been eliminated in the third order differences. Before a final decision is reached, of course, further tests would be made, using other possible values of $t$, and dealing with higher orders of differences. The above calculations should, however, be sufficient to make clear the mode of application of the method of selection (7) and its associated test of significance.

7. It is not essential that the sequence of pairs of differences be chosen as in (7) above. Any sequence satisfying conditions (ii) and (iii) of § 4 may be used, and a corresponding criterion, similar to (8), obtained by inserting the appropriate terms in the $T_{g,h}$ and using the correct value of the correlation coefficient between corresponding differences. The sequence

$$(\Delta^k u_{r+1},\ \Delta^{k+1} u_r), \quad r = t,\ t+k+2,\ t+2(k+2),\ \ldots,\ t+(j'-1)(k+2)$$

is very similar to (7) and leads, in fact, to identically the same criterion.

Other alternative methods of selection may aim at reducing the correlation between corresponding differences. For example, in the sequence

$$(\Delta^k u_r,\ \Delta^{k+1} u_{r+p}), \quad r = t,\ t+k+2+p,\ t+2(k+2+p),\ \ldots,$$

this correlation is

$$(-1)^{p+1}\,\frac{{}^{2k+1}C_{k+p+1}}{({}^{2k}C_k\;{}^{2k+2}C_{k+1})^{1/2}} = (-1)^{p+1}\left[\tfrac{1}{2}(2k+1)/(k+1)\right]^{1/2}\frac{k(k-1)\cdots(k-p+1)}{(k+2)(k+3)\cdots(k+p+1)}.$$

Unfortunately the more the correlation is reduced the greater the proportion of data eliminated by this method of selection.

A further possible method, possessing the advantage of symmetry, could be based on the comparison of $k$th and $(k+2)$th order differences, using the sequence

$$(\Delta^k u_{r+1},\ \Delta^{k+2} u_r), \quad r = t,\ t+k+3,\ t+2(k+3),\ \ldots,$$

in which the correlation between corresponding differences is

$$-\left[(2k+1)(k+2)/(2k+3)(k+1)\right]^{1/2}.$$
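The correlations quoted in §§ 4 and 7 can be checked numerically by writing each difference as a linear combination of independent $u$'s with equal variance. The sketch below does this; the function names are hypothetical, and the pairing $(\Delta^k u_r, \Delta^{k+1} u_{r+p})$ used for the shifted selection is an assumed reading of the text.

```python
from math import comb, sqrt

def coeffs(order, offset, width):
    """Coefficients of the difference of given order starting at u_{r+offset}."""
    c = [0] * width
    for i in range(order + 1):
        c[offset + i] = (-1) ** (order - i) * comb(order, i)
    return c

def correlation(k, a, m, b):
    """Correlation of the k-th difference at r+a with the m-th difference at r+b."""
    width = max(a + k, b + m) + 1
    x, y = coeffs(k, a, width), coeffs(m, b, width)
    cov = sum(p * q for p, q in zip(x, y))
    return cov / sqrt(sum(p * p for p in x) * sum(q * q for q in y))

def shifted_closed_form(k, p):
    """Closed form for the pair (k-th difference at r, (k+1)-th at r+p)."""
    ratio = 1.0
    for i in range(p):
        ratio *= (k - i) / (k + 2 + i)
    return (-1) ** (p + 1) * sqrt((2 * k + 1) / (2 * (k + 1))) * ratio

def symmetric_closed_form(k):
    """Closed form for the pair (k-th difference at r+1, (k+2)-th at r)."""
    return -sqrt((2 * k + 1) * (k + 2) / ((2 * k + 3) * (k + 1)))

rho_selection7 = correlation(2, 0, 3, 0)  # selection (7) with k = 2
```

Taking $p = 0$ in `shifted_closed_form` recovers the correlation of selection (7), which for $k = 2$ is $-\sqrt{5/6}$.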


REFERENCES 

Anderson, O. (1923). Biometrika, 15, 134.

Anderson, O. (1926). Biometrika, 18, 293.

Anderson, O. (1927). Biometrika, 19, 53.

Hsu, C. T. (1940). Ann. Math. Statist. 11, 410.

Tintner, G. (1940). The Variate Difference Method. Cowles Commission, Principia Press, Bloomington, Ind.

Yule, G. U. (1921). J. Roy. Statist. Soc. 84, 497.





REVIEW 

The Advanced Theory of Statistics. Vol. II, pp. 1-521. By M. G. Kendall, M.A.
London: Griffin and Co., Ltd. 1946. Price: 50s.

There is only a small measure of agreement among writers on the theory of statistics as to what should
constitute an advanced course in the subject. There is no common view about either the nature of the
problems which should be discussed or the methods by which they should be solved. Disagreement
about the actual details of solutions of theoretical problems is, perhaps, only a passing phase and its 
importance may be exaggerated. The way in which the problems themselves are posed, however, 
reveals more fundamental cleavages of opinion which may not be so easily reconciled. 

An important element which renders difficult the formulation of theoretical problems is the uncertain 
relation existing between the theory and the practice of the application of statistical methods, as soon 
as we advance beyond the most elementary stages. In discussing this relation it is convenient to 
distinguish between the theoretician or, as he is now categorized, the mathematical statistician, on the 
one hand, and the practical statistician, using statistical methods in some particular field of inquiry, 
on the other. This distinction may have some validity if it is not pushed too hard. It is true that at 
one extreme there is a body of workers whose interest in the theory of statistics is primarily as a source 
of mathematical problems. And, at the other extreme, there are experimentalists who apply statistical 
methods by rule of thumb, without much concern for the reasoning which is needed to justify them. 
But, although the modern tendencies in the organization of all the sciences may be such as to force
workers into extreme positions and label them accordingly, it must be recognized that, in statistics,
the theoretical and the practical investigator are still, fortunately, often one and the same person. 

This fact undoubtedly makes for a closer integration of theoretical developments with practical 
usages, but not, perhaps, to the extent that one might imagine. It is in the nature of theoretical work — 
almost in its definition — that, whoever may be responsible for it, it should assert the right to an 
independent existence of its own and refuse to be tied down too closely by considerations of its ultimate 
usefulness. Even were statistical theory to be cultivated exclusively by writers who at the same time 
were outstandingly competent in dealing with the more commonplace, everyday, statistical investiga- 
tions, it would still not escape the tendency to become a purely abstract discipline. The time spent on 
it would still have to be justified by secondary considerations, such as its aesthetic appeal, educational 
value and the like, as well as by its ability to provide novel statistical procedures of practical value in 
the experimental and social sciences. 

The tendency of statistical theory towards exclusive concern with abstract generalizations is helped, 
of course, by those mathematicians who are interested in nothing else. There is, however, as one might 
expect, a strong reaction against it from statisticians who still maintain a wider outlook. This reaction 
expresses itself, sometimes explosively, in the form of exasperated criticism of this or that piece of 
theoretical work on the grounds, either of its irrelevance, or worse, of the misleading impressions which 
it may give of the true objects of practical statistical investigations. The criticism is not always fair 
or well directed. Its authors are seldom immune from the failings, real or imaginary, which they castigate 
in others. But it is sufficient to show that statisticians as a body, however pleased they may be that 
their subject is gaining increased academic recognition, are yet unwilling to sanction the appearance 
of two separate subjects — pure and applied statistics — which can develop without much reference 
one to the other. Practical interests are still able to make themselves felt even in academic develop- 
ments. Further, since the practical interests of statisticians are so diverse, it is not surprising that 
a parallel difficulty is found in reaching agreement as to what is the proper subject-matter for the 
advanced theory of statistics. 

Mr M. G. Kendall, in the notable work of which the second volume is here under review, does not 
attempt to confine the subject within any very strict limits, nor to press upon the reader any very
decided views as to what should constitute its most important features. His method is rather to let 
the truth appear — if appear it can — by making an extensive and detailed survey of the whole range 
covered by modern contributions.

The first four chapters of this second volume are concerned with the fundamental problem of 
estimating population parameters from sample values and of assigning limits to such parameters on 
a probability basis. This has in the past produced a most controversial literature, revolving round the 
question of the applicability or otherwise of the theory of inverse probability and of the various 
alternatives to this theory which have been proposed. Mr Kendall devotes most attention to the 
approach which does not make explicit use of the concept of inverse probability. Some readers might
consider that he should have given more space to the alternative position which is founded on Bayes's
Theorem and its corollaries. However, in a sense, Harold Jeffreys’s excellent restatement of this position 
in his Theory of Probability renders a lengthy description of it unnecessary, unless, indeed, Mr Kendall's 
object had been to make a critical comparison and evaluation of the inverse and direct probability 
approaches. Mr Kendall is content to describe rather than to judge. One might, perhaps, criticize him 
in this section of the book for carrying his non-committal attitude too far in the face of the irreconcilable 
viewpoints which emerge from the discussion. He could, possibly, have pressed his own ideas upon us 
more strongly in certain passages, without departing from the high standard of fairness which he 
maintains throughout in representing the ideas of his statistical colleagues. 

The problem of scientific inference discussed in these first four chapters, although a fertile source of
argument, is, strangely enough, one which, in whatever way it is settled, does not appear to influence
much the actual way in which scientific investigation is carried out in practice. The next five chapters
are concerned with a miscellaneous range of questions, which, while not having the same attraction for
the pure theorist, are more closely related to those statistical methods used most frequently in the
everyday interpretation of experimental data. They include, for instance, some discussion of the theory
of the analysis of variance and of the role which randomization plays in ensuring the applicability of
this theory in practice. The attitude which one adopts towards the theoretical questions discussed in
this part of the book can affect quite directly the choice between the alternative experimental
procedures which may be at one's disposal in some specific inquiry.

The succeeding two chapters revert to a more generalized treatment following the same lines as 
the earlier ones. They are again concerned with the fundamental problem of scientific inference, although 
now from the restricted angle of significance testing. They furnish another illustration of the author’s 
remarkable ability to reproduce the spirit as well as the substance of the original contributions covered 
by his survey. They again suffer, perhaps, from some lack of critical evaluation on the author’s own part. 

The next chapter deals with multivariate analysis, including discriminant functions. From the 
theoretical standpoint this is a straightforward development from the univariate and bivariate cases. 
Some new problems are raised, but much is simply generalization. Of the possible applications of these 
recent developments it is difficult to speak confidently. It is clear that in many fields multivariate
analysis of the type here described can only proceed on the basis of many dubious assumptions as to
the nature of possible connexions between the variates. Moreover, simple interpretation of the results
of a multivariate analysis in terms which convey much to the layman is difficult. On the other hand,
in certain cases where a very large number of variates are measured (e.g. in intelligence testing) some 
logical way of dealing with the results is needed, and one may in some instances be sure enough of one’s 
assumptions to exploit this recent theoretical work. 

The two final chapters deal with Time Series, a subject perhaps most studied by economists, but not
one which has borne much fruit. Trade cycles have been sufficiently talked about, but there has been
little analysis of data demonstrating their existence in an unequivocal fashion. Mr Kendall is the
foremost representative of a school of thought which holds that the search for regular periodicities in
economic data has been largely a waste of time, and that much more is to be hoped for by considering
so-called autoregressive schemes, which allow for irregularities in the lengths of periods — albeit 
irregularities governed by some simple law. It is too soon yet to say whether the new conceptions will 
be much more successful than the old, but whatever success is obtained will be largely due to 
Mr Kendall’s own efforts to clarify the subject. 

Looking back after reading through the two volumes of Mr Kendall's ambitious enterprise, one is
forced to the realization that here we have one of the most remarkable compilations that has ever
been attempted by a single writer in any branch of science. If several writers had banded together to
produce it one would still be impressed by the wide range of topics and viewpoints which it embraces.
One might almost be pardoned, indeed, for thinking that 'M. G. Kendall' was a pseudonym standing
for the collaboration of several persons. This would, at all events, be a pleasant theory to explain
the many excellences of the work, particularly its freedom from personal animosities and the generosity
of its references to the contributions of so many different writers. It would also explain the somewhat
loose-knit nature of the work judged as a whole. Occasionally one feels the lack of a sufficiently strong
common thread running through it and holding it together; such as one would perhaps obtain if the
author were inclined to be more trenchant in his criticisms of what other people write and more
self-assertive in putting forward his own contributions. But in reply to all possible criticisms, Mr Kendall
has the final word, when, in his concluding section, he says: 'Much remains to be done; and this book
will have served its purpose if the reader is left with the desire to do some of it himself.' Few authors
could by implication have exposed so clearly just what it is that remains to be done.


B.L.W. 




December 1948 


Volume XXXV, Parts III and IV 


SOME FURTHER NOTES ON THE USE OF MATRICES 
IN POPULATION MATHEMATICS 

By P. H. LESLIE 

Bureau of Animal Population, Department of Zoological Field Studies,
Oxford University

CONTENTS

1. Introduction

2. The stable female birth-rate

3. The biological significance of the row vectors

4. The total reproductive value of a population and the length of a vector

5. The limited type of population growth

6. The predator-prey relationship between two populations

References

1. Introduction 

The use of matrices in population mathematics has been discussed in a previous paper (Leslie, 1945), and some of the properties of the basic matrix representing a system of age-specific fertility and mortality rates have been described both there, and also in an earlier paper by Lewis (1942).† The purpose of the following notes is to enlarge on a few points left over from the earlier work, and in the later sections to extend the use of matrices and vectors to the case of the logistic type of population growth and to the predator-prey type of relationship between two or more populations.

In order to save a troublesome amount of cross-referring, it may perhaps be a convenience if the definitions and properties of the basic vectors and matrices are summarized here, and also if a brief account is given of the various transformations which are at one time or another used in the theoretical development. For fuller details reference may be made to the appropriate section of the original paper.

As before, for the sake of simplicity, the female population only will be considered, and the same unit of age will be adopted as that of time. If $m$ to $m+1$ is the last age group in the complete life-table distribution defined by $L_x = \int_x^{x+1} l_x\,dx$ (taking $l_0 = 1$), and we put

$P_x$ $(x = 0, 1, 2, \ldots, m-1)$ $= L_{x+1}/L_x$ = the probability that a female aged $x$ to $x+1$ at time $t$ will be alive in the age group $x+1$ to $x+2$ at time $t+1$,

$F_x$ $(x = 0, 1, 2, \ldots, m)$ = the number of daughters born in the interval $t$ to $t+1$ per female alive aged $x$ to $x+1$ at time $t$, who will be alive in the age group 0 to 1 at time $t+1$,

† At the time my original paper was published I was not aware that the same problem had already been investigated by Lewis (1942). This author establishes the form of the basic matrix and discusses a number of its properties, including the role of the dominant latent root and the form of the stable age distribution. He suggests that the rapidity with which an arbitrary age distribution settles down to the latter form will depend on the difference between the dominant and subdominant root of the characteristic equation, and he also discusses the type of matrix in which there is only a single non-zero element in the first row. It is clear, therefore, that unwittingly I covered a good deal of ground which had already been covered by him. I am indebted to Prof. M. S. Bartlett and Dr S. Vajda for this reference.


we are led to consider the square matrix $M$ of order $m+1$ which has the $F_x$ figures in the first row and the $P_x$ figures in the subdiagonal immediately below the principal diagonal. For many purposes, however, it may not be necessary to deal with the matrix $M$ as a whole. Thus, if $x = k$ is the last age group within which reproduction occurs, all the $F_x$ figures for $x > k$ will be zero and the determinant $|M| = 0$. Partitioning the matrix symmetrically at this point the principal, non-singular, submatrix is

$$A = \begin{bmatrix} F_0 & F_1 & F_2 & \cdots & F_{k-1} & F_k \\ P_0 & 0 & 0 & \cdots & 0 & 0 \\ 0 & P_1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & P_{k-1} & 0 \end{bmatrix}.$$



As before an arbitrary age distribution will be written as the column vector $\xi$, different age distributions being distinguished by different subscripts. The number of elements composing a $\xi$ vector may be either $m+1$ or $k+1$ depending on whether the particular age distribution considered is complete, or confined only to the pre-reproductive and reproductive age groups. Associated with each $\xi_x$ there is a uniquely determined vector $\eta_x$, which in matrix notation is written as a row vector, the square of the length of the vector $\xi_x$ being given by the scalar product $\eta_x \xi_x$. If the age distribution $\xi_x$ is complete, consisting of $m+1$ elements, the last $m-k$ elements of the associated vector $\eta_x$ will all be zero. Generally speaking, however, the post-reproductive age groups can be neglected, more particularly in the theoretical development, and unless otherwise stated it will be assumed that we are dealing with $\eta$ and $\xi$ vectors consisting of $k+1$ elements which are subject to the system of rates represented by the submatrix $A$.

It was shown in the previous paper (Leslie, 1945, § 5) that it is convenient for many purposes to pass to a new frame of reference, the vectors $\eta$ and $\xi$ and the matrix $A$ undergoing the non-singular linear transformations

$$\eta = \phi H, \quad \xi = H^{-1}\psi, \quad B = HAH^{-1},$$

where $H$ is a diagonal matrix with elements $(P_0 P_1 P_2 \ldots P_{k-1})$, $(P_1 P_2 P_3 \ldots P_{k-1})$, $\ldots$, $(P_{k-2} P_{k-1})$, $P_{k-1}$, 1, which are derived entirely from the life table. (If the matrix $M$ is the subject of the transformation instead of $A$, the matrix $H$ may be suitably enlarged and will include all the $P_x$ figures down to $P_{m-1}$.) It will be noted that in this collineatory transformation the square of the length of a vector is an invariant, and that the matrices $A$ and $B$ have the same characteristic equation and, therefore, the same latent roots.

The effect of this transformation on the elements of $A$ is to replace the $P_x$ figures in the principal subdiagonal by a series of units, and thus to reduce $A$ to the rational canonical form. In biological terms it is equivalent to transforming the original population into one in which all the individuals live until the span of reproductive life is completed at the age of $x = k+1$. This imaginary type of population, with which in many ways it is more convenient to work, might be termed the canonical population.

When the relation between two column vectors is such that

$$B\psi_a = \lambda\psi_a,$$

where $\lambda$ is a scalar, then $\psi_a$ is termed a stable $\psi$ appropriate to the matrix $B$. Similarly in the case of initial row vectors, if

$$\phi_a B = \lambda\phi_a,$$

then $\phi_a$ is a stable $\phi$ appropriate to $B$.

It may be shown that corresponding to each distinct latent root $\lambda_a$ of the characteristic equation of $B$,

$$|B - \lambda I| = 0,$$

there is a pair of stable vectors $\phi_a$ and $\psi_a$ which in the usual way may be normalized so that $\phi_a\psi_a = 1$. In the case when all the $k+1$ latent roots of $B$ are distinct, the normalized stable $\psi$ form a set of $k+1$ independent and mutually orthogonal vectors of unit length, and any arbitrary $\psi_x$ may be expanded in terms of them, viz.

$$\psi_x = c_1\psi_1 + c_2\psi_2 + c_3\psi_3 + \ldots + c_{k+1}\psi_{k+1},$$

where the coefficients $c_a$ may be either real or complex. Similarly the associated row vector $\phi_x$ can be expanded in terms of the stable $\phi$,

$$\phi_x = \bar{c}_1\phi_1 + \bar{c}_2\phi_2 + \ldots + \bar{c}_{k+1}\phi_{k+1},$$

where $\bar{c}_a$ is the complex conjugate of $c_a$ in the expansion of $\psi_x$. Similarly, by transforming back to the original co-ordinate system, any arbitrary $\xi_x$ can be expanded in terms of the stable $\xi$ and its associated vector $\eta_x$ in terms of the stable $\eta$.

Since only one of the latent roots, and this the dominant one of the matrix $B$, is real and positive, only one of the stable $\xi$ will consist of real and positive elements. It is this stable $\xi_1 = H^{-1}\psi_1$, associated with the dominant root $\lambda_1$, which is ordinarily referred to as the stable age distribution appropriate to a given set of age-specific fertility and mortality rates. The relation between the inherent rate of increase ($r$) and the dominant root of the matrix is given by $\log_e \lambda_1 = r$.

There is one further transformation of the matrix $B$ which is of some theoretical importance. The expansion of an arbitrary $\psi_x$ in terms of the normalized stable $\psi$ may be written in matrix notation as

$$\psi_x = Qc_x,$$

where the columns of the matrix $Q$ consist of the stable $\psi$ arranged from left to right in descending order of the moduli of the roots with which they are associated. In the same way the expansion of an arbitrary $\phi_x$ may be written

$$\phi_x = \bar{c}'_x U,$$

where $\bar{c}'_x$ is the transposed complex conjugate of the vector $c_x$, and the rows of the matrix $U$ are formed by the stable $\phi$ arranged in a similar order from above down. Since the normalized stable vectors have the properties

$$\phi_a\psi_b = \begin{cases} 1 & (a = b), \\ 0 & (a \ne b), \end{cases}$$

it follows that $U$ and $Q$ are reciprocal matrices ($UQ = I$). In this transformation to an orthogonal co-ordinate system the length of a vector remains an invariant and the matrix $B$ becomes

$$UBQ = UHAH^{-1}Q = C,$$

where $C$ is a diagonal matrix whose elements are the latent roots of $B$ (reduction to classical canonical form).

Since an arbitrary age distribution $\psi_x = H\xi_x$ must necessarily consist of real and positive elements, and since $\psi_x = Qc_x$, $\phi_x = \bar{c}'_x U$, we have $\phi_x = \psi'_x \bar{U}'U = \psi'_x G$, where $G$ is a symmetrical matrix of real elements. Thus, in terms of the original co-ordinate system, since $H$ is a diagonal matrix unaltered by transposition,

$$\eta_x = \xi'_x HGH.$$

The matrix $HGH$, or $G$ if the work is being carried out in terms of the canonical population, has the important property of converting a column vector into the associated row vector. The reciprocal relationship is given by

$$\xi_x = H^{-1}G^{-1}H^{-1}\eta'_x,$$

where $G^{-1} = Q\bar{Q}'$. For further properties of the metric matrix $G$ see the previous paper (Leslie, 1945, § 11).

It may perhaps be of interest if the actual values of some of these matrices are given for 
a simple numerical example, which will be used in some of the later sections in order to 
illustrate certain points. Although this example is purely a mathematical model bearing 
no relation to any known species, its properties are the same as those which might be observed 
for a population of living organisms considered in a small number of age groups, and for 
convenience biological terms will be used throughout in interpreting the results obtained 
with this matrix. Suppose, then, we have an entirely imaginary population which can be 
considered in four age groups, and let the life table or stationary age distribution be given 
by the $L_x$ values forming the column vector $\{0.9, 0.7, 0.5, 0.3\}$. Further let the matrix

$$A = \begin{bmatrix} 0 & 45/7 & 18 & 18 \\ 7/9 & 0 & 0 & 0 \\ 0 & 5/7 & 0 & 0 \\ 0 & 0 & 3/5 & 0 \end{bmatrix}. \tag{1.1}$$


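Then, since $H$ is the diagonal matrix with elements $h_{11} = P_0 P_1 P_2$, $h_{22} = P_1 P_2$, $h_{33} = P_2$, $h_{44} = 1$, we have

$$H = \begin{bmatrix} 1/3 & 0 & 0 & 0 \\ 0 & 3/7 & 0 & 0 \\ 0 & 0 & 3/5 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

and

$$HAH^{-1} = B = \begin{bmatrix} 0 & 5 & 10 & 6 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$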

The characteristic equation $|B - \lambda I| = 0$ is, when expanded in powers of $\lambda$,

$$\lambda^4 - 5\lambda^2 - 10\lambda - 6 = 0,$$

and the latent roots are therefore $\lambda_1 = 3$; $\lambda_2, \lambda_3 = -1 \pm i$; $\lambda_4 = -1$. In the
transformation to the classical canonical form $UBQ = C$, the matrix

$$Q = \frac{1}{\sqrt{68}} \begin{bmatrix}
27 & 4.9985 + 6.4019i & 4.9985 - 6.4019i & \sqrt{17}\,i \\
9 & 0.7017 - 5.7002i & 0.7017 + 5.7002i & -\sqrt{17}\,i \\
3 & -3.2010 + 2.4992i & -3.2010 - 2.4992i & \sqrt{17}\,i \\
1 & 2.8501 + 0.3509i & 2.8501 - 0.3509i & -\sqrt{17}\,i
\end{bmatrix},$$

$$U = \frac{1}{\sqrt{68}} \begin{bmatrix}
1 & 3 & 4 & 2 \\
2.8501 + 0.3509i & -3.2010 + 2.4992i & -13.5488 - 7.4545i & -7.4977 - 9.6029i \\
2.8501 - 0.3509i & -3.2010 - 2.4992i & -13.5488 + 7.4545i & -7.4977 + 9.6029i \\
-\sqrt{17}\,i & \sqrt{17}\,i & 4\sqrt{17}\,i & 6\sqrt{17}\,i
\end{bmatrix}$$

and

$$G = U'U = \begin{bmatrix}
0.5072 & -0.4484 & -2.1539 & -2.1982 \\
-0.4484 & 0.8674 & 1.9041 & 1.5882 \\
-2.1539 & 1.9041 & 11.2688 & 11.2109 \\
-2.1982 & 1.5882 & 11.2109 & 13.4245
\end{bmatrix},$$


where in each case the elements have been rounded off to the fourth decimal place. 
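
The matrices of this example can be checked numerically. What follows is only a minimal sketch in plain Python and is not part of the original computation: the helper names `psi`, `phi` and `scale` are illustrative, and it is assumed that the dash in $U'$ denotes the conjugate transpose, since that reading is the one which reproduces the real matrix $G$ printed above.

```python
import cmath
from fractions import Fraction as Fr

# Matrix A of (1.1) and the diagonal elements of H, as given in the text.
A = [[Fr(0), Fr(45, 7), Fr(18), Fr(18)],
     [Fr(7, 9), Fr(0), Fr(0), Fr(0)],
     [Fr(0), Fr(5, 7), Fr(0), Fr(0)],
     [Fr(0), Fr(0), Fr(3, 5), Fr(0)]]
h = [Fr(1, 3), Fr(3, 7), Fr(3, 5), Fr(1)]

# B = H A H^{-1}: because H is diagonal, b_ij = h_i * a_ij / h_j.
B = [[h[i] * A[i][j] / h[j] for j in range(4)] for i in range(4)]

roots = [3, -1 + 1j, -1 - 1j, -1]

def f(lam):
    """Characteristic polynomial of B."""
    return lam**4 - 5 * lam**2 - 10 * lam - 6

def psi(lam):
    """Unscaled column (right) latent vector of B."""
    return [lam**3, lam**2, lam, 1]

def phi(lam):
    """Unscaled row (left) latent vector of B."""
    return [1, lam, lam**2 - 5, lam**3 - 5 * lam - 10]

def scale(lam):
    """Common factor making phi_a psi_a = 1; the unscaled product is f'(lam)."""
    return 1 / cmath.sqrt(4 * lam**3 - 10 * lam - 10)

# Rows of U are the scaled phi_a; columns of Q are the scaled psi_a.
U = [[scale(r) * v for v in phi(r)] for r in roots]
Q = [[scale(r) * psi(r)[i] for r in roots] for i in range(4)]

# G = U'U, reading the dash as the conjugate transpose (an assumption).
G = [[sum(U[a][i].conjugate() * U[a][j] for a in range(4)) for j in range(4)]
     for i in range(4)]

for row in G:
    print([round(g.real, 4) for g in row])
```

Exact fractions are used for $A$, $H$ and $B$ so that the first row of $B$ comes out as whole numbers; complex floating-point arithmetic is needed only for the latent vectors.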

Since the $a$th row of the matrix $U$ is the stable vector $\phi_a$ which is associated with the stable
$\psi_a$ vector given by the $a$th column of $Q$, it is possible to construct readily from their rows
and columns the set of four matrices $S_a = \psi_a\phi_a$, which have the properties (Leslie, 1945, §9)

$$S_a^2 = S_a, \qquad S_aS_b = 0 \ (a \neq b), \qquad \sum_a S_a = I.$$

If $f(B)$ is a polynomial of the matrix $B$, we have when the latent roots of the matrix are
distinct,

$$f(B) = \sum_{a=1}^{k+1} f(\lambda_a) S_a.$$

Thus

$$B^t = \lambda_1^t S_1 + \lambda_2^t S_2 + \dots + \lambda_{k+1}^t S_{k+1},$$

so that in the present example, when the matrix $B$ is raised to a high power and $\lambda_1^t$ is much
greater than all the remaining $\lambda_a^t$,

$$B^t \propto \begin{bmatrix} 27 & 81 & 108 & 54 \\ 9 & 27 & 36 & 18 \\ 3 & 9 & 12 & 6 \\ 1 & 3 & 4 & 2 \end{bmatrix},$$

and hence, by transforming back to the original co-ordinate system,

$$H^{-1}B^tH = A^t \propto \begin{bmatrix} 81 & 312.4286 & 583.2 & 486 \\ 21 & 81.0000 & 151.2 & 126 \\ 5 & 19.2857 & 36.0 & 30 \\ 1 & 3.8571 & 7.2 & 6 \end{bmatrix}$$


for large values of t. 
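
The limiting form of $A^t$ can be illustrated by raising the matrix to a moderately high power. A rough sketch using exact fractions follows; the power $t = 40$ and the function names `matmul` and `matpow` are arbitrary choices made for this illustration only.

```python
from fractions import Fraction as Fr

# Matrix A of (1.1).
A = [[Fr(0), Fr(45, 7), Fr(18), Fr(18)],
     [Fr(7, 9), Fr(0), Fr(0), Fr(0)],
     [Fr(0), Fr(5, 7), Fr(0), Fr(0)],
     [Fr(0), Fr(0), Fr(3, 5), Fr(0)]]

def matmul(X, Y):
    """Product of two square matrices held as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, t):
    """X raised to the non-negative integer power t by repeated multiplication."""
    n = len(X)
    R = [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(t):
        R = matmul(R, X)
    return R

At = matpow(A, 40)

# Divide through by the bottom-left element so that it becomes 1, as in the text.
scaled = [[float(v / At[3][0]) for v in row] for row in At]
for row in scaled:
    print([round(v, 4) for v in row])
```

Since the remaining roots have moduli $\sqrt{2}$ and 1 against a dominant root of 3, their contribution dies away roughly as $(\sqrt{2}/3)^t$, and by $t = 40$ it is far below the four decimal places shown.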


2. The stable female birth-rate 

Once the dominant latent root of the matrix has been found, there is one comparatively simple
way of calculating the stable age distribution. Thus, working in terms of the canonical
population and $m + 1$ age groups, the stable $\psi_1$ appropriate to the root $\lambda_1$ may be taken
proportional to the column vector $\{\lambda_1^m, \lambda_1^{m-1}, \dots, \lambda_1, 1\}$, and by operating on this vector with the
matrix $H^{-1}$, the stable age distribution can readily be obtained. The method which was
used previously for calculating the stable female birth-rate was then to operate on this
distribution with the maternal frequency figures† ($m_x$) and thus determine the total number
of female births which might be expected per unit of time (Leslie, 1945, §16). Although it
seems likely that no very great error would be made in employing these methods, both the
stable age distribution and the stable birth-rate can be defined rather more formally for the
discontinuous case, and the appropriate equations can be derived for calculating them
directly when the work is being carried out in terms of discrete age groups.

† The maternal frequency $m_x$ is the mean number of live daughters born per unit of time to a female
aged $x$ to $x + 1$. They are the figures tabulated in the usual type of fertility table and are not the same
as the $F_x$ figures forming the first row of the matrix.

Consider at time $t$ a stable age distribution $\xi(t)$ appropriate to the dominant latent root
$\lambda$ of the matrix $M$, and let $n_x$ ($x = 0, 1, 2, \dots, m$) be the elements of this column vector. Then
by the definition of a stable vector

$$\xi(t - x) = \lambda^{-x}\xi(t).$$

If $B(t)$ = the number of daughters born alive in the whole population in the interval of time
$t$ to $t + 1$, it is easily seen that since $n_x$ are the number of individuals alive aged $x$ to $x + 1$,

$$\lambda n_0 = L_0 B(t),$$

$$\lambda n_1 = P_0 n_0 = L_1 B(t - 1),$$

and in general

$$\lambda n_x = L_x B(t - x).$$

If we put

    $\pi_x$ = the proportion of the stable population alive in the age group $x$ to $x + 1$,
and $N(t)$ = the total number of individuals alive in the stable population at time $t$,

we have $\lambda = N(t + 1)/N(t)$. Defining the birth-rate

$$\beta = B(t)/N(t),$$

we have in the case of the stable population,

$$B(t - x) = \beta N(t - x) = \beta N(t)\lambda^{-x},$$

so that

$$\pi_x = \beta L_x \lambda^{-(x+1)}, \qquad (2.1)$$

an expression which defines the matrix stable age distribution. From this it follows, since

$$\sum_{x=0}^{m} \pi_x = 1,$$

that

$$\frac{1}{\beta} = \sum_{x=0}^{m} L_x \lambda^{-(x+1)}. \qquad (2.2)$$
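
As a check on equations (2.1) and (2.2), they may be applied to the imaginary four-group population of the introduction. The sketch below is an illustration only: it takes $\lambda = 3$ and the life-table values $L_x = 0.9, 0.7, 0.5, 0.3$, the value $L_2 = 0.5$ being the one consistent with $P_1 = 5/7$ and $P_2 = 3/5$ in the matrix (1.1).

```python
from fractions import Fraction as Fr

# Life-table values of the four-group example (L_2 = 0.5 matches P_1 = 5/7, P_2 = 3/5).
L = [Fr(9, 10), Fr(7, 10), Fr(5, 10), Fr(3, 10)]
lam = Fr(3)   # dominant latent root of the matrix A

# Equation (2.2): 1/beta = sum over x of L_x * lam^-(x+1).
beta = 1 / sum(Lx / lam ** (x + 1) for x, Lx in enumerate(L))

# Equation (2.1): pi_x = beta * L_x * lam^-(x+1).
pi = [beta * Lx / lam ** (x + 1) for x, Lx in enumerate(L)]

print(beta)                              # 5/2
print([float(p * 108) for p in pi])      # proportions scaled to a total of 108
```

The proportions, multiplied by 108, return the stable vector $\{81, 21, 5, 1\}$ obtained earlier from the latent vector of the matrix, so the two routes to the stable age distribution agree for this example.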


This argument for the case of discrete age classes is, of course, developed along lines similar
to those followed by Lotka (e.g. 1939, p. 16) for the continuous case, where, if $c_x$ is the
proportion of the stable population aged between $x$ and $x + dx$ and $b$ the instantaneous
birth-rate,

$$c_x = b e^{-rx} l_x$$

and

$$\frac{1}{b} = \int_0^\infty e^{-rx} l_x \, dx. \qquad (2.3)$$


The birth-rate $\beta$ as defined by (2.2) is, however, a different type of birth-rate to that
defined by (2.3). It is the total number of births taking place in the interval of time $t$ to $t + 1$
expressed per head of population at time $t$. If $D(t)$ is the number of deaths occurring in the
same interval and $\delta = D(t)/N(t)$,

$$N(t + 1) = N(t) + B(t) - D(t),$$

and thus, in the case of the stable population,

$$\lambda = 1 + \beta - \delta.$$





In order to express the relationship between $\beta$ and $b$, we might consider that in the
continuous case the number of births occurring during the interval of time $t$ to $t + 1$ will be
given by

$$B(t) = bN(t)\int_0^1 e^{r\tau}\,d\tau,$$

whence

$$\beta = \frac{b(e^r - 1)}{r},$$

or, since $\log_e \lambda = r$,

$$b = \frac{\beta r}{\lambda - 1}. \qquad (2.4)$$


As an illustration of the comparative results obtained by applying these equations, we
may take the same imaginary population of Rattus norvegicus as was used previously as
a numerical example (Leslie, 1945). In the appendix to that paper it was shown that for the
given system of fertility and mortality rates the value of $r$, estimated by the more usual
methods of computation, was 0.44565, and that $b = 0.51265$, this value of the birth-rate
being obtained by the numerical integration of (2.3). When the system of rates was expressed
in the form of a matrix of order $21 \times 21$, the dominant root was $\lambda_1 = 1.56246$, or $r = 0.44626$,
and using equation (2.2) $\beta = 0.64839$, and from (2.4) $b = 0.5144$. The agreement between
these estimates of the stable birth-rate is reasonably close and suggests that when we have
already calculated the life table age distribution, which is so often the case, equations (2.2)
and (2.4) of this section will provide an alternative method of calculating $b$, which would
save a great deal of the tedious labour involved in the numerical integration of (2.3). Although
theoretically it is necessary to consider the entire age span of the life table in applying these 
equations, this was not done in the present instance. In the numerical example given above 
the value of the rate of increase is so high that the post-reproductive age groups could be 
neglected without any very great error. 
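
The conversion from $\beta$ to $b$ by equation (2.4) is a one-line calculation. The sketch below merely repeats the arithmetic with the two figures quoted above for the matrix of order $21 \times 21$; nothing in it is new data.

```python
import math

lam = 1.56246     # dominant root quoted in the text
beta = 0.64839    # birth-rate from (2.2), quoted in the text

r = math.log(lam)              # r = log_e(lambda)
b = beta * r / (lam - 1)       # equation (2.4)

print(round(r, 5), round(b, 4))
```

The result agrees with the quoted $r = 0.44626$ and $b = 0.5144$ to the number of places given.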

The stable birth-rate and death-rate of the transformed or canonical population ($\psi_1$
vector) are perhaps only of academic interest. In this connexion, however, there is a small
point worth mentioning in order to correct a misstatement which was made in the previous
paper. In a footnote (p. 208) it was there stated that ‘in the transformed population the
death-rate = 0’. Strictly speaking this would only be approximately true under certain
conditions; for, if in the case of the stable canonical population

$$\lambda = 1 + \beta' - \delta',$$

where dashes are attached to the symbols in order to distinguish them from those used above,
we have by putting $L_x = 1$ in (2.2) and carrying out the summation,

$$\beta' = \frac{\lambda^{m+1}(\lambda - 1)}{\lambda^{m+1} - 1},$$

and hence

$$\delta' = \frac{\lambda - 1}{\lambda^{m+1} - 1},$$

which will approach zero as $\lambda^{m+1}$ becomes large. Actually in the numerical example given
in the footnote referred to, the value of $\lambda^{m+1}$ was sufficiently great for $\delta'$ to be taken as
approximately zero without any very serious error being incurred. 
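
For reference, the summation referred to above may be written out in full. With every $L_x$ put equal to unity, equation (2.2) becomes a geometric series, and the death-rate then follows from $\lambda = 1 + \beta' - \delta'$:

```latex
\frac{1}{\beta'} = \sum_{x=0}^{m} \lambda^{-(x+1)}
  = \frac{1 - \lambda^{-(m+1)}}{\lambda - 1}
  = \frac{\lambda^{m+1} - 1}{\lambda^{m+1}(\lambda - 1)},
\qquad
\delta' = 1 + \beta' - \lambda
  = (\lambda - 1)\left\{\frac{\lambda^{m+1}}{\lambda^{m+1} - 1} - 1\right\}
  = \frac{\lambda - 1}{\lambda^{m+1} - 1}.
```

As an illustration outside the original text, the four-group matrix of the introduction has $\lambda = 3$ and $m = 3$, so that $\delta' = 2/80 = 0.025$, which is small but not zero.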


3. The biological significance of the row vectors

The columns of the matrix $M^t$ are a measure of the contributions made by each age group to
the total population at time $t$. Thus, for example, if there were $n_j$ individuals alive in the age
group $j$ to $j + 1$ at $t = 0$, the number and age distribution of their living descendants and
survivors at time $t$ could be found by multiplying the elements in the $(j + 1)$th column of $M^t$
by $n_j$, and hence their total contribution to the population at this time is given by $n_j$ times
the sum of these elements. It was shown previously (Leslie, 1945, §4) that for values of
$t \geq m - k$, where $x = k$ is the last age group within which reproduction occurs, the last $m - k$
columns of $M^t$ will consist only of zero elements, an expression of the obvious fact that
individuals alive in the post-reproductive age groups contribute nothing to the population
after they themselves are dead. From the point of view of the contributions made to the
future population by the individual age groups, it is the submatrix $A^t$ which is principally
of interest. When $t$ becomes very large, $A^t$ can be taken as being proportional to the matrix

$$H^{-1}S_1H = H^{-1}\psi_1\phi_1H = \xi_1\eta_1,$$

and therefore the sums of the elements in the columns of $A^t$ must be proportional to the row
vector $\eta_1$. Since a population with an arbitrary age distribution tends ultimately to approach
the stable form, provided that the system of age-specific fertility and mortality rates remains
constant, it follows that the normalized row vector associated with the dominant latent root
provides a measure of the relative contributions per head made to the stable population in
the future by the individual age groups. Thus, supposing we have two arbitrary age
distributions $\xi_x$ and $\xi_y$ both subject to the same constant system of age-specific rates, the ratio
between the total number of individuals in the two populations would, as time went on,
tend to the figure

$$R = \frac{\eta_1\xi_x}{\eta_1\xi_y}.$$

If, instead of regarding $\xi_x$ and $\xi_y$ as two separate populations, we regard them as two
components of an age distribution $\xi_z$, it is thus possible to estimate their relative contributions
to the population in the future, subject to the condition that the system of rates represented
by the matrix A remains constant. 
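
The limiting ratio can be illustrated with the four-group matrix of the introduction. In the sketch below the two initial age distributions are hypothetical, chosen only to be as different as possible, and the row vector is left unnormalized because any scalar factor cancels in the ratio.

```python
from fractions import Fraction as Fr

# Matrix A of (1.1) and its row vector for the dominant root lambda_1 = 3 (unnormalized).
A = [[Fr(0), Fr(45, 7), Fr(18), Fr(18)],
     [Fr(7, 9), Fr(0), Fr(0), Fr(0)],
     [Fr(0), Fr(5, 7), Fr(0), Fr(0)],
     [Fr(0), Fr(0), Fr(3, 5), Fr(0)]]
eta1 = [Fr(1, 3), Fr(9, 7), Fr(12, 5), Fr(2)]

def project(xi, t):
    """Apply the matrix A to the column vector xi for t units of time."""
    for _ in range(t):
        xi = [sum(a * n for a, n in zip(row, xi)) for row in A]
    return xi

def value(xi):
    """Scalar product eta_1 xi."""
    return sum(e * n for e, n in zip(eta1, xi))

xi_x = [Fr(10), Fr(0), Fr(0), Fr(0)]   # hypothetical: ten individuals in the youngest group
xi_y = [Fr(0), Fr(0), Fr(0), Fr(10)]   # hypothetical: ten individuals in the oldest group

R = value(xi_x) / value(xi_y)
ratio_60 = sum(project(xi_x, 60)) / sum(project(xi_y, 60))
print(R, float(ratio_60))
```

After sixty units of time the ratio of the two population totals is indistinguishable, at ordinary floating-point precision, from the value of $R$ computed at $t = 0$.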

If, in this expression for $R$, we put $\xi_y = \xi_1$, the normalized stable vector associated with
the dominant root of the matrix, we may write

$$V = \eta_1\xi_x,$$

or, since the angle between two vectors $\xi_x$ and $\xi_y$ of lengths $x$ and $y$ respectively, is

$$\cos\theta = \frac{\eta_y\xi_x}{yx}, \qquad V = x\cos\theta_x,$$

where $\theta_x$ is the angle $\xi_x$ makes with the stable vector of unit length. Thus, when $\xi_x$ is the
stable form of age distribution ($= c_1\xi_1$), the quantity $V$ is the same as the length of the
vector $\xi_x$, since $\cos\theta_x = 1$, and when the population is not distributed as to age in the stable
form, $0 < V < x$. The rate of increase of $V$ with regard to time is $dV/dt = rV$, since $V(t) = \lambda_1^t V(0)$.

This quantity $V$ appears to be essentially the same as that termed the total reproductive
value of a population by Fisher (1930, p. 27). In discussing the equation

$$\int_0^\infty e^{-rx} l_x m_x \, dx = 1,$$

by means of which the inherent rate of increase $r$ is usually calculated, Fisher points out the
close analogy between a population increasing geometrically and the growth of capital
invested at compound interest. Thus the birth of a child can be regarded as the loaning to
him of a life and the birth of his offspring as a subsequent repayment of the debt. Then,
‘a unit investment has an expectation of a return $l_x m_x\,dx$ in the time interval $dx$, and the
present value of this repayment, if $r$ is the rate of interest, is $e^{-rx} l_x m_x\,dx$; consequently the
Malthusian parameter of population increase is the rate of interest at which the present
value of births of offspring to be expected is equal to unity at the date of birth of their parent’.
(In this quotation the original symbolism has been changed to that used here; Fisher writes
$m$, the Malthusian parameter, instead of $r$, and the maternal frequency $b_x$ instead of $m_x$.)
Fisher then goes on to say that ‘we may ask, not only about the newly born, but about persons
of any chosen age, what is the present value of their future offspring; and if the present value
is calculated at that rate determined as before, the question has a definite meaning: To
what extent will persons of this age, on the average, contribute to this ancestry of future
generations?’ He then defines the reproductive value which can be assigned to a person
aged $x$ as

$$v_x = \frac{e^{rx}}{l_x} \int_x^\infty e^{-rt} l_t m_t \, dt.$$

Thus, by assigning to each of the $n_x$ persons aged $x$ the appropriate value $v_x$ and summing
over all age classes of a given age distribution, a figure which Fisher terms the total
reproductive value of the population may be obtained. He also pointed out that this total
reproductive value would increase or decrease according to the correct Malthusian rate $r$.

It was not difficult to show on an actual numerical example that the values of $v_x$ were the
same, apart from a scale factor, as the elements of the $\eta_1$ row vector after allowing for the
fact that the latter refer to a population considered in discrete age groups, whereas the former
refer to values of $x$ which vary continuously; and it was evident that the calculation of the
quantity $V$ defined above was essentially the same as the calculation of Fisher’s total
reproductive value of the population.

There is, however, one important point in regard to the argument developed by Fisher
which has been quoted. The present value of the repayment $l_x m_x\,dx$ is taken to be $e^{-rx} l_x m_x\,dx$,
where $r$ is the rate of interest. But, in the case of a population, this estimate of the present
value would only be valid if the whole population were increasing at a rate $r$, and this would
only be true when the stable form of age distribution was established. In other words, the
reproductive value $v_x$ assigned to a female aged $x$ is the present value of her future daughters
only when that female and her daughters are considered as members of a population with
a stable age distribution. That this is so may be seen from a numerical example. Let us
suppose we are given the age distribution

$$\xi_a = \{81, 21, 5, 1\},$$

which is a stable $\xi$ appropriate to the dominant root $\lambda_1 = 3$ of the numerical matrix $A$ (1.1)
defined in the introduction. In one unit of time the population will be $A\xi_a = \lambda_1\xi_a$ and these
individuals will be either survivors or descendants of the original population. Each individual
alive in the latter will contribute on the average so many living individuals to the population
at $t = 1$, and we wish to assess the present value of that contribution. Consider first of all
the solitary female alive in the last age group. In one unit’s time this individual will be no
longer alive, but she will have contributed $F_3 = 18$ living daughters to the population at
that time. The present value of that contribution will therefore be $F_3/\lambda = 6$, and this is the
present value which may be attached to each individual alive in this age group of a stable
population at any given time. Passing to the five individuals in the next younger age group,
$5P_2 = 3$ will be alive in the fourth age group at $t = 1$, and each of these three will be valued
then at 6 or a total of 18. They will also have contributed $5F_2 = 90$ daughters. The present
total value of the contribution made by these five individuals will be therefore

$$(90 + 18)/3 = 36,$$

or 7.2 per head. In the same way the 21 individuals in the second age group will each be
valued at 3.85714 and the 81 in the first age group at 1 each. These values which have been
determined in this way may be written as the row vector

$$\eta_1^* = [1, 3.85714, 7.2, 6],$$

where an asterisk is attached to the symbol in order to distinguish this vector from the true
normalized form for this particular matrix, namely,

$$\eta_1 = \frac{1}{\sqrt{68}}\,[0.33333, 1.28571, 2.4, 2],$$

and it will be noted that $\eta_1^* = 3\sqrt{68}\,\eta_1$.

It is clear from this example that this method of assessing the present value of the
contribution made by each female aged $x$ to $x + 1$ to the population at time $t + 1$ is equivalent to
determining the present value of her future daughters, and that the valuation can only be
carried out in this way when that female and her daughters are considered as members of
a stable age distribution. Symbolically the equation which defines the elements $y_x$
($x = 0, 1, 2, \dots, k$) of the vector $\eta^*$, and which is equivalent to that given by Fisher for $v_x$ in
the continuous case, is

$$y_x = \frac{\lambda^x}{L_x} \sum_{j=x}^{k} \lambda^{-(j+1)} L_j F_j,$$

and by an obvious extension to the case of stable ‘age distributions’ consisting of complex
or negative individuals, the stable $\eta^*$ representing the present value of the ‘contributions’
made by each individual could be calculated similarly for each distinct latent root $\lambda_a$ of
a given matrix $A$. Moreover, it is evident in each case $y_x = 0$ for all values of $x > k$, the last
age group in which reproduction occurs. 
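
The valuation carried out step by step in the numerical example amounts to a backward recursion, $y_k = F_k/\lambda$ and $y_x = (F_x + P_x y_{x+1})/\lambda$, and this can be compared with the summation formula for $y_x$. The sketch below is an illustration only; the function names are hypothetical, and $L_x$ is taken relative to $L_0 = 1$, which is permissible because only ratios of the $L_x$ enter the formula.

```python
from fractions import Fraction as Fr

F = [Fr(0), Fr(45, 7), Fr(18), Fr(18)]   # first row of the matrix A of (1.1)
P = [Fr(7, 9), Fr(5, 7), Fr(3, 5)]       # subdiagonal of the matrix A
lam = Fr(3)                              # dominant latent root

def present_values(F, P, lam):
    """Backward recursion: value the oldest group first, then work towards age 0."""
    k = len(F) - 1
    y = [Fr(0)] * (k + 1)
    y[k] = F[k] / lam
    for x in range(k - 1, -1, -1):
        y[x] = (F[x] + P[x] * y[x + 1]) / lam
    return y

def closed_form(F, P, lam):
    """Summation formula y_x = (lam^x / L_x) * sum_{j>=x} lam^-(j+1) L_j F_j."""
    k = len(F) - 1
    L = [Fr(1)]                  # L_x relative to L_0 = 1
    for p in P:
        L.append(L[-1] * p)
    return [lam ** x / L[x] * sum(L[j] * F[j] / lam ** (j + 1) for j in range(x, k + 1))
            for x in range(k + 1)]

y = present_values(F, P, lam)
print([float(v) for v in y])
```

Both routes give the same vector, whose first element is unity and whose remaining elements are 27/7, 36/5 and 6.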

The use of these row vectors in the form $\eta^*$ has, however, certain disadvantages, more
particularly when it is necessary to compare the total present values of two stable age
distributions which are each subject to a different system of rates of death and reproduction.
It will be seen from the above equation defining $v_x$ that if the maternal frequency is measured
in terms of daughters, we must have in all cases $v_0 = 1$, since

$$\int_0^\infty e^{-rx} l_x m_x \, dx = 1 \quad \text{and} \quad l_0 = 1.$$

Similarly in the discrete case, the value of $y_0$ may be written, making use of the relationship
$(P_0P_1P_2 \dots P_x) = L_{x+1}/L_0$,

$$y_0 = \frac{F_0}{\lambda} + \frac{P_0F_1}{\lambda^2} + \frac{P_0P_1F_2}{\lambda^3} + \dots + \frac{(P_0P_1P_2 \dots P_{k-1})F_k}{\lambda^{k+1}},$$

which must be equal to unity, since from the characteristic equation of the matrix

$$\lambda^{k+1} - F_0\lambda^k - P_0F_1\lambda^{k-1} - \dots - (P_0P_1 \dots P_{k-2})F_{k-1}\lambda - (P_0P_1 \dots P_{k-1})F_k = 0.$$

Thus, as exemplified in the numerical illustration given above, the vector $\eta^*$ will always have
its first element equal to unity and will in general differ from the normalized $\eta_1$ by some scalar
factor. The vector $\eta^*$ measures the total value of a stable population on a different scale, or
in a different system of units, to those in which the present value is measured by the vector
$\eta_1$. But the question of the respective units in which a number of such values are expressed
might become of importance if two or more stable populations subject to different systems
of rates were being compared. Suppose these rates are represented by a number of different
matrices $A_1, A_2, \dots, A_n$, which will be assumed to be all of the same order. If the series of
reproductive values for the individual age groups is taken as the row vector $\eta^*$ appropriate
to each of the given matrices, the first element of each vector will necessarily be unity as has
been shown above. That the use of these vectors in this form for calculating the total present
value may lead to unsatisfactory results for the comparison between two stable populations,
can be seen from a simple example. Suppose each element of the numerical matrix $A$ defined
in the introduction (1.1), and which we will now call $A_1$, is divided by a factor of 3. The
resulting matrix, say $A_2$, can then be taken as representing a new system of rates which
has a dominant latent root $\lambda_1 = 1$. The stable age distribution $\xi_a = \{81, 21, 5, 1\}$ of $A_1$ is,
however, also a stable $\xi_a$ of $A_2$ appropriate to this root. If the stable $\eta^*$ for the second matrix
is calculated as before the elements will be the same as those given above for the original
matrix. The total present value of the population represented by $\xi_a$ would therefore be
estimated at the same figure whichever of the two systems of rates it was subject to. If then
these were two separate populations with rates $A_1$ and $A_2$, which happened to have identical
age distributions, a comparison between them by means of the total values calculated in
this way is not very informative. The easiest way out of this difficulty would be to use only
the normalized $\eta_1$ associated with the dominant latent root of each matrix in calculating the
total present value of a stable population for the purpose of comparing it with that of
another. This procedure allows for any difference in what may be termed the respective
scales of the two matrices. For this particular example, the normalized $\eta_1$ associated with
the root $\lambda_1 = 1$ of the matrix $A_2$ is

$$\eta_{21} = \frac{1}{\sqrt{1836}}\,[0.33333, 1.28571, 2.4, 2] = 0.19245\,\eta_{11},$$

where the initial of the two suffixes refers to the matrix with which the vector is associated.
The total value of a population with an age distribution $\xi_a$ would therefore be 8.2462 if it
was subject to the system of rates represented by the matrix $A_1$, and 1.5870 when subject
to $A_2$. Thus the use of the normalized row vectors instead of the form $\eta^*$ leads to a different
value being placed on each of the two populations corresponding to a difference in the
systems of rates to which they are respectively exposed. 
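
The two total values quoted for this example can be reproduced directly from the matrices. The sketch below is one possible way of doing so and is not the procedure of the original paper: it passes to the canonical form, builds the row and column latent vectors for the dominant root, scales them equally so that their product is unity, and transforms the row vector back. The function name `normalized_eta1` is hypothetical.

```python
import math
from fractions import Fraction as Fr

def normalized_eta1(F, P, lam):
    """Normalized row vector eta_1 for first-row figures F, survival rates P and root lam."""
    k = len(F) - 1
    # Diagonal of H: h_j = P_j P_{j+1} ... P_{k-1}, with the last element equal to 1.
    h = [Fr(1)] * (k + 1)
    for j in range(k - 1, -1, -1):
        h[j] = h[j + 1] * P[j]
    b = [h[0] * F[j] / h[j] for j in range(k + 1)]    # first row of B = H A H^{-1}
    phi = [Fr(1)]                                     # row latent vector of B
    for j in range(1, k + 1):
        phi.append(lam * phi[j - 1] - b[j - 1])
    psi = [lam ** (k - j) for j in range(k + 1)]      # column latent vector of B
    norm = sum(p * q for p, q in zip(phi, psi))       # unscaled product phi psi
    return [float(phi[j] * h[j]) / math.sqrt(norm) for j in range(k + 1)]

F1 = [Fr(0), Fr(45, 7), Fr(18), Fr(18)]
P1 = [Fr(7, 9), Fr(5, 7), Fr(3, 5)]
F2 = [v / 3 for v in F1]          # every element of A_1 divided by 3
P2 = [v / 3 for v in P1]
xi_a = [81, 21, 5, 1]

v1 = sum(e * n for e, n in zip(normalized_eta1(F1, P1, Fr(3)), xi_a))
v2 = sum(e * n for e, n in zip(normalized_eta1(F2, P2, Fr(1)), xi_a))
print(round(v1, 4), round(v2, 4), round(v2 / v1, 5))
```

The ratio of the two totals is $1/\sqrt{27}$, the factor 0.19245 by which the two normalized row vectors differ.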

We may conclude, therefore, that in calculating the total value of a stable population it
will in general be preferable to use the normalized stable row vector $\eta_1$ and not the form $\eta^*$.
The one form, however, can be readily transformed into the other. For, working in terms of
age distributions confined to the prereproductive and reproductive age groups, if the
elements of $\eta^*$ are calculated by means of the above equation for $y_x$, the relationship between
$\eta_1^*$ and $\eta_1$ is given by

$$\eta_1 = (P_0P_1P_2 \dots P_{k-1}) \left\{\frac{df(\lambda)}{d\lambda}\right\}^{-1/2} \eta_1^*,$$

where $df(\lambda)/d\lambda$ is the characteristic equation of the matrix differentiated with respect to $\lambda$,
in which the numerical value of the dominant root is inserted and the square root taken
with a positive sign. Thus, for the numerical example which has been used previously in
this section, the characteristic equation of the matrix $A$ defined by (1.1) is

$$f(\lambda) = \lambda^4 - 5\lambda^2 - 10\lambda - 6,$$

and

$$\frac{df(\lambda)}{d\lambda} = 4\lambda^3 - 10\lambda - 10.$$

For $\lambda_1 = 3$, we have

$$\left\{\frac{df(\lambda)}{d\lambda}\right\}^{1/2} = \sqrt{68},$$

and, since $P_0P_1P_2 = 1/3$ for this matrix,

$$\eta_1 = \frac{1}{3\sqrt{68}}\,\eta_1^*,$$

corresponding to the difference between these two vectors which was noted above. Although
this procedure has been illustrated in terms of the dominant root of the matrix, it can be
similarly carried out for any stable $\eta^*$ appropriate to a latent root $\lambda_a$. Alternatively, the
normalized row vectors may be readily calculated in terms of the canonical population and
the matrix $B = HAH^{-1}$ by the methods described in the previous paper (Leslie, 1945, §§7
and 8), and transformed back again by means of the relationship $\eta = \phi H$.

If Fisher’s total reproductive value of a population is written in terms of vectors as the
scalar

$$V = \eta_1\xi_x = x\cos\theta_x,$$

it follows, as was pointed out earlier in this section, that when the population represented
by the vector $\xi_x$ is of the stable form of age distribution, we have $V = x$, the length of $\xi_x$.
The total reproductive value, or the total present value, of a stable population is therefore
given by the length of the vector representing the age distribution of the population. Now
any population of individuals with a stable form of age distribution $\xi_a$ can be represented
as a multiple $c_1\xi_1$ of the normalized stable $\xi$ associated with the dominant root of the matrix,
and its associated vector $\eta_a$ as a multiple $c_1$ of the normalized $\eta_1$, the square of the length of
$\xi_a$ being given by $\eta_a\xi_a$. We may thus regard the vector $\eta_a = c_1\eta_1$, which is associated with
the vector $\xi_a = c_1\xi_1$, as the representative of the population in terms of the individual present
values according to age, just as the vector $\xi_a$ is the representative of the population in terms
of numbers according to age. Although we have been here considering only the total present
value of a population of real positive individuals distributed as to age in the stable form,
which must necessarily involve only one of the stable $\eta$ or $\xi$ for a given matrix, there is little
difficulty from the mathematical point of view in considering ‘populations’ consisting of
negative or complex individuals, and we may extend the arguments used for the real case
so as to include all the stable vectors for the matrix. Thus, the length of any stable vector,
$\xi_a$ say, which fulfils the condition $A\xi_a = \lambda_a\xi_a$, can be regarded as the total present value of
the ‘population’ represented in terms of numbers by $\xi_a$ and in terms of individual present
values by its associated vector $\eta_a$.

Since any arbitrary age distribution of real individuals $\xi_x$ can be regarded as the sum of
one or more mutually orthogonal stable $\xi$, viz.

$$\xi_x = c_1\xi_1 + c_2\xi_2 + \dots + c_{k+1}\xi_{k+1},$$

and its associated vector $\eta_x$ similarly as the sum of a number of associated stable $\eta$

$$\eta_x = \bar{c}_1\eta_1 + \bar{c}_2\eta_2 + \dots + \bar{c}_{k+1}\eta_{k+1},$$

and since the total present value of each of the component stable vectors is given by the
length of that vector, namely $\sqrt{\bar{c}_a c_a}$, the total present value of the resultant $\xi_x$ will be given
by $\sqrt{\sum \bar{c}_a c_a}$, which is the length of the vector $\xi_x$.

The row vectors which were originally introduced into this theoretical discussion solely
for mathematical reasons are thus not entirely without interest from the biological point of
view. The uniquely determined vector $\eta_x$ which was assumed to be associated with each $\xi_x$
is a measure of the present value of the contribution made to future generations by an
individual aged $x$ to $x + 1$ when that individual is considered as a member of a population
with an age distribution $\xi_x$. The row vectors appear to form a more generalized system of
weights or values which we attach to an individual aged $x$ to $x + 1$ than the reproductive
values $v_x$ defined by Fisher. The latter are represented by a single member of this class of
vectors, though one of particular importance owing to its association with the dominant
root of the matrix.

Finally there is one further row vector which is very easily calculated for a given system of
age-specific fertility and mortality rates, and which on occasion may be useful in studying
the comparative fertility of different populations. The net reproduction rate,

$$R_0 = \int_0^\infty l_x m_x \, dx,$$

in addition to its usual meaning, may also be defined as the expected number of daughters
which will be born on the average by a female now aged 0 during the remainder of her
lifetime. It is in fact a figure which is analogous to the expectation of life at birth, only in terms
of future daughters. Now, in addition to the newly born, we may also enquire what this
expected number of daughters will be in the case of a female alive at any age $x$. Clearly this
figure is given by

$$u_x = \frac{1}{l_x} \int_x^\infty l_t m_t \, dt,$$

with $u_0 = R_0$. Similarly, in the discrete case, we may consider an $\eta$ row vector of which the
elements $z_x$ ($x = 0, 1, 2, \dots, k$) are

$$z_x = \frac{1}{L_x} \sum_{j=x}^{k} L_j F_j,$$

and it will be found that this is merely a multiple of the $\eta^*$ vector appropriate to the dominant
root $\lambda_1 = 1$ of the matrix for a stationary population which is obtained by dividing each of
the $F_x$ figures in the first row of the matrix $A$ by the net reproduction rate.
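
The statement about the stationary population can be checked on the four-group example. The sketch below is an illustration under stated assumptions: it uses the life-table values $L_x = 0.9, 0.7, 0.5, 0.3$ and the $F_x$ of the matrix (1.1), takes the discrete net reproduction rate to be $z_0$, and values each age group in the stationary matrix by the backward recursion with the root equal to unity.

```python
from fractions import Fraction as Fr

L = [Fr(9, 10), Fr(7, 10), Fr(5, 10), Fr(3, 10)]   # life-table values of the example
F = [Fr(0), Fr(45, 7), Fr(18), Fr(18)]             # first row of the matrix A of (1.1)
k = len(F) - 1
P = [L[x + 1] / L[x] for x in range(k)]            # survival rates 7/9, 5/7, 3/5

# Expected number of future daughters of a female in the age group x to x + 1.
z = [sum(L[j] * F[j] for j in range(x, k + 1)) / L[x] for x in range(k + 1)]
R0 = z[0]                                          # discrete net reproduction rate

# Stationary matrix: divide each F_x by R0, so that the dominant root is 1,
# then value each age group by the recursion y_x = F_x + P_x * y_{x+1}.
Fs = [v / R0 for v in F]
y = [Fr(0)] * (k + 1)
y[k] = Fs[k]
for x in range(k - 1, -1, -1):
    y[x] = Fs[x] + P[x] * y[x + 1]

print([float(v) for v in z], float(R0))
```

For this example $z = [21, 27, 28.8, 18]$ and each $z_x$ is exactly $R_0 = 21$ times the corresponding element of the $\eta^*$ vector of the stationary matrix.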

4. The total reproductive value of a population and the length of a vector

It appears from the foregoing discussion that the elements of the normalized row vector $\eta_1$
can be regarded from two slightly different points of view. On the one hand they provide
a measure of the relative contributions per head made by each age group to the stable
population in the future, and this property arises from the fact that the sums of the columns of the
matrix $A^t$ can be taken as proportional to the elements of this vector when $t$ becomes very
large. On the other hand this vector is also associated with the column vector $\xi_1$ representing
the stable age distribution appropriate to a given matrix, and in this sense its elements are
a measure of the present value of the contribution made to future generations by an
individual aged $x$ to $x + 1$ when that individual is considered as a member of a population with
a stable age distribution. This difference is of importance in making any practical use of
Fisher’s total reproductive value of a population, which is defined here as $V = \eta_1\xi_x$, where
$\xi_x$ is an arbitrary age distribution.

Thus, if we have two populations $\xi_x$ and $\xi_y$ both of which are subject to the same system of
rates $A$, or alternatively if $\xi_x$ and $\xi_y$ are two subdivisions of one population subject to $A$,
we can calculate for each the total reproductive values $V_x$ and $V_y$, and determine the ratio
$R = V_x/V_y$. This quantity, as was shown at the beginning of the previous section, is the ratio
at time $t$, when $t$ becomes very great, of the total number of individuals in the two populations
which at $t = 0$ had the age distributions $\xi_x$ and $\xi_y$. But the quantity $R$ cannot be interpreted
in this way when the two populations are not subject to the same system of rates.




Again, if a population happens to have a stable form of age distribution $\xi_a$, then
$V = \eta_1\xi_a = a$, the length of the vector $\xi_a$, and this figure represents the total present value
of the stable population $\xi_a$. But, apart from the case when an arbitrary $\xi_x$ is of the stable
form, it is difficult to define the meaning of $V$ simply by itself in any precise biological terms.
From the mathematical point of view, when an arbitrary $\xi_x$ is expanded in terms of the
stable $\xi$, and

$$\xi_x = c_1\xi_1 + c_2\xi_2 + \dots + c_{k+1}\xi_{k+1},$$

we have

$$\eta_x\xi_x = \sum_{a=1}^{k+1} \bar{c}_a c_a,$$

which is the same thing as $x^2$, the square of the length of the vector $\xi_x$. Then it can be seen that
since $V = \eta_1\xi_x = c_1 = x\cos\theta_x$, the calculation of Fisher’s total reproductive value is
essentially the determination of one component of a set of mutually orthogonal sums of squares
which together make up the total sum of squares represented by $x^2$. Thus $V^2 = c_1^2$ which is
the first term in $\eta_x\xi_x = \sum_a \bar{c}_a c_a$, since $c_1$ is necessarily a real positive number.

The two methods of valuation which have been mentioned here are the calculation of the
length of the vector $\xi_x$ representing the age distribution of the population, and the
calculation of the total reproductive value $V$. Which of these two figures is the more important
from the point of view of assessing the state of a population subject to a given system of
fertility and mortality rates is a matter for discussion and further investigation. Certainly
the total reproductive value $V$ is a figure which is the more easily determined. It requires
only a knowledge of the row vector $\eta_1$ associated with the dominant root of the matrix
representing the given system of rates to which the population is subject. On the other hand
the calculation of the length of the vector $\xi_x$ is much more complicated. For, in order to
arrive at the associated vector $\eta_x = \xi'_x HGH$, it is necessary to know the numerical values
of the elements of the matrix $G$, and hence $HGH$, which in turn cannot be computed unless
all the latent roots of the matrix $A$ are known. Thus, purely from the practical point of view,
the calculation of the total reproductive value $V = \eta_1\xi_x$ offers a number of advantages and,
within the limitations set out above, this figure may prove useful in comparing one popula- 
tion with another. 
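
The two methods of valuation can be set side by side for the four-group example. The sketch below is illustrative only: it obtains the coefficients $c_a$ of an age distribution as $c = UH\xi$, takes $V$ as the first coefficient and the length as the square root of the sum of the squared moduli of all four. The helper names and the second, uniform age distribution are hypothetical.

```python
import cmath
import math

h = [1 / 3, 3 / 7, 3 / 5, 1.0]          # diagonal of H for the matrix A of (1.1)
roots = [3, -1 + 1j, -1 - 1j, -1]       # latent roots of B = H A H^{-1}

def phi(lam):
    """Unscaled row latent vector of B."""
    return [1, lam, lam**2 - 5, lam**3 - 5 * lam - 10]

def scale(lam):
    """Factor making phi_a psi_a = 1; the unscaled product is f'(lam)."""
    return 1 / cmath.sqrt(4 * lam**3 - 10 * lam - 10)

U = [[scale(r) * v for v in phi(r)] for r in roots]

def coefficients(xi):
    """c = U H xi, the co-ordinates of xi referred to the stable vectors."""
    psi_x = [hj * n for hj, n in zip(h, xi)]
    return [sum(u * p for u, p in zip(row, psi_x)) for row in U]

def value_and_length(xi):
    """Return (V, x): the total reproductive value and the length of xi."""
    c = coefficients(xi)
    V = c[0].real
    x = math.sqrt(sum(abs(ca) ** 2 for ca in c))
    return V, x

V_stable, x_stable = value_and_length([81, 21, 5, 1])     # stable form
V_other, x_other = value_and_length([27, 27, 27, 27])     # hypothetical uniform form
print(round(V_stable, 4), round(x_stable, 4))
print(round(V_other, 4), round(x_other, 4))
```

For the stable distribution the two figures coincide at $\sqrt{68}$, while for the uniform distribution $V$ falls short of the length, as the inequality $0 < V < x$ requires. The length needs all four latent roots; $V$ needs only the first row of `U`.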

It is perhaps worth mentioning in passing one further type of problem. If the length of
the vector $\xi$ is regarded as the present value of the population when it is subject to a particular
system of fertility and mortality rates, it may be of interest on occasion to consider the
maximum or minimum of the quadratic form $\xi'HGH\xi$, given one or more restrictive
conditions. Thus, for example, we might consider the problem of determining the column vector
$\xi$ which would give rise to the minimum total value when the sum of its elements was equal
to a number $N$. If $n_x$ ($x = 0, 1, 2, \dots, k$) are the elements of $\xi$, and the symbol $\{1\}$ represents
a column vector of $(k + 1)$ units, we have, after differentiating with respect to the $n_x$ and
introducing a Lagrange multiplier $\lambda$,

$$HGH\xi_g - \lambda\{1\} = 0,$$

$$\sum n_x = N,$$

a set of $(k + 2)$ equations for determining the values of $n_x$ which will make the length of the
vector $\xi_g$ a minimum subject to the restrictive condition imposed. It will be seen from these
equations that the solution of this problem is equivalent to that of determining the column
vector $\xi_g$ which will have all the elements of its associated row vector $\eta_g$ the same value.
Thus, by reversing the process, and starting with an arbitrary row vector of $(k + 1)$ units, it
follows that the required column vector is proportional to the sums of the columns of the
matrix $H^{-1}G^{-1}H^{-1}$. As an example of the type of vector which has the minimum value, the
solution of these equations in the case of the simple $4 \times 4$ matrix given in the introduction
was for $N = 108$,

$$\xi_g = \{84.5686, 17.4054, 4.5059, 1.5201\},$$

whereas the stable population of 108 individuals was

$$\xi_a = \{81, 21, 5, 1\}.$$

This problem has been considered here in terms of the vector of shortest length, without
imposing the full restrictive conditions which strictly speaking would be necessary when
considering a population of living individuals, namely that the elements $n_x$ of the column
vector are positive integers with $\sum n_x = N$. But the vector $\xi_s$ in this example consists of
positive elements and may be taken as representing, in the case of this numerical system,
the type of proportionate age distribution which would give rise to the minimum value.
Actually the difference between the two distributions $\xi_s$ and $\xi_a$ is not very marked in this
example. The square of the length of the stable vector is 68, while that of the vector of
shortest length is 64.4. But that this difference between the total values does correspond to
a difference between the properties of the two age distributions may be seen by operating
on each of them with the matrix $A$ and determining the total number of individuals in the
two populations at successive intervals of time. The numbers in the population which starts
with an age distribution $\xi_s$ will always be lower than those in the population starting with the
stable form until ultimately there would be about 5.3 % fewer individuals in the former
than in the latter.


5. The limited type of population growth 


Hitherto it has been assumed that the system of age-specific fertility and mortality rates
represented by the matrix $A$ remains constant, and that therefore the population increases
geometrically to an unlimited extent at a rate $dN/dt = rN$, when the stable age distribution
is established. The next case which is usually considered in population mathematics is that
of the logistic population, where the rate of increase in numbers is defined by the differential
equation

$$\frac{dN}{dt} = (r - aN)N,$$

$r$ and $a$ being constants > 0, from which the well-known result follows that such a population
will approach asymptotically an upper limit to the numbers given by $K = r/a$, according
to the equation

$$N = \frac{K}{1 + Ce^{-rt}}.$$

It is therefore of interest to consider in terms of matrices and vectors the type of population 
growth in which the system of rates is dependent on the number of individuals present in 
the population at a given time. 

Suppose that the system of rates to which a population is exposed when no limitations
are placed upon the growth in numbers is represented by the matrix $A$ with a dominant
latent root $\lambda_1$. This might be called the optimum system of rates for the particular species
or genetic stock. When the population is increasing in a limited environment let us suppose
that at time $t$ there is an age distribution $\xi(t)$ consisting of a total number $N(t)$ of individuals,
and that at this time the elements of $A$ are altered so that we have a new matrix $A_t$ with a
dominant latent root $\lambda_1/q(t)$, where $q(t)$ is dependent on $N(t)$. Then the age distribution of
the population at time $t+1$ will be given by $A_t\xi(t) = \xi(t+1)$, and the process can obviously
be extended so that at time $t+1$ we have a matrix $A_{t+1}$ with a dominant root $\lambda_1/q(t+1)$,
$q(t+1)$ depending on $N(t+1)$, and so on. At each integral value of $t$, therefore, the original
inherent rate of increase $r = \log_e\lambda_1$ will in general change to a new rate $r' = \log_e(\lambda_1/q)$,
where $q$ is some function of $N$, the number of individuals present in the population.

The changes which are thus assumed to occur in the optimum age-specific rates of fertility 
and mortality represented by the matrix A might take place in an innumerable variety of 
different ways. But, from the theoretical point of view, there are two extreme cases which
are particularly of interest; on the one hand, when the decrease in the optimum rate of
increase is due to a lowered degree of fertility, while the age-specific death-rates remain the
same: and on the other when it is due to an increased rate of mortality and fertility remains
constant. Even under these simplified conditions it is necessary to make some assumption 
as to the way in which the rates are actually affected, and in order to define the problem in 
concrete terms, it will be assumed here that the changes which occur either in the degree of 
fertility or in that of mortality are due to the operation of a factor which is independent of 
age. In addition one further type of change in the rates of fertility, involving a factor which 
increases geometrically with age, will be mentioned in passing. For simplicity the two main 
cases will be considered separately. 


(a) Mortality affected by a factor independent of age, fertility remaining constant

If $l_x$ and $m_x$ are respectively the life table and fertility table for a population living under
optimum conditions where no limitations are placed upon the growth in numbers, the
inherent rate of increase ($r$) of the population is defined by

$$\int_0^\infty e^{-rx}l_xm_x\,dx = 1,$$

and the stable age distribution ($c_x$) and the stable birth-rate ($b$) by

$$c_x = be^{-rx}l_x, \qquad \frac{1}{b} = \int_0^\infty e^{-rx}l_x\,dx.$$

If now a force of mortality ($\gamma$) which is independent of age is superimposed on the original
force of mortality ($\mu_x$), represented by the optimum life table $l_x$, the new life table $l'_x$ will
be given by

$$-\frac{1}{l'_x}\frac{dl'_x}{dx} = \mu_x + \gamma, \quad\text{or}\quad l'_x = e^{-\gamma x}l_x,$$

and, if the original fertility table remains unaltered, the new inherent rate of increase will
be $r' = r - \gamma$. The stable age distribution ($c'_x$) and stable birth-rate ($b'$) of the population when
it is subject to this new life table will then be

$$c'_x = b'e^{-r'x}l'_x, \qquad \frac{1}{b'} = \int_0^\infty e^{-r'x}l'_x\,dx,$$

and it follows, since $l'_x = e^{-\gamma x}l_x$ and $r' = r - \gamma$, that $1/b' = 1/b$ and $c'_x = c_x$. The imposition
of a force of mortality independent of age on a given life table thus leaves the original stable
age distribution and stable birth-rate unchanged.
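
The cancellation behind this result can be set out line by line. What follows is only a worked
restatement of the argument above, with $\gamma$ denoting the added force of mortality, $l'_x$ the new
life table and $r'$, $b'$, $c'_x$ the new rate of increase, birth-rate and age distribution; no symbol
beyond those of the text is introduced.

$$
\begin{aligned}
e^{-r'x}\,l'_x &= e^{-(r-\gamma)x}\,e^{-\gamma x}\,l_x = e^{-rx}\,l_x, \\
\int_0^\infty e^{-r'x}\,l'_x\,m_x\,dx &= \int_0^\infty e^{-rx}\,l_x\,m_x\,dx = 1, \\
\frac{1}{b'} = \int_0^\infty e^{-r'x}\,l'_x\,dx &= \int_0^\infty e^{-rx}\,l_x\,dx = \frac{1}{b}, \\
c'_x = b'e^{-r'x}\,l'_x &= b\,e^{-rx}\,l_x = c_x .
\end{aligned}
$$

The second line confirms that $r' = r - \gamma$ does satisfy the defining equation for the new
inherent rate of increase; the third and fourth give the unchanged birth-rate and age distribution.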

Similarly in terms of matrices, if $A$ is the matrix representing the age-specific rates of
fertility and mortality for a population living under optimum conditions, we are led to
consider the matrix $q^{-1}A$ in which each element of the original matrix $A$ is divided by a
scalar $q$. Approximately, in the discrete case, this is equivalent to imposing on the original
life table a force of mortality which is independent of age. Then in the reduction of $q^{-1}A$ to
rational canonical form, if

$$H_q = \mathrm{diag}\{(P_0P_1\dots P_{k-1})q^{-k},\ (P_1P_2\dots P_{k-1})q^{-(k-1)},\ \dots,\ P_{k-1}q^{-1},\ 1\},$$

the first row of $B_q = H_q(q^{-1}A)H_q^{-1}$ is

$$F_0q^{-1},\quad P_0F_1q^{-2},\quad P_0P_1F_2q^{-3},\quad \dots,\quad (P_0P_1P_2\dots P_{k-1})F_kq^{-(k+1)},$$

the remaining elements consisting in the usual way of a series of units in the principal
subdiagonal.

The characteristic equation of the matrix $B_q$ is

$$\lambda^{k+1} - F_0q^{-1}\lambda^k - P_0F_1q^{-2}\lambda^{k-1} - \dots - (P_0P_1\dots P_{k-2})F_{k-1}q^{-k}\lambda - (P_0P_1\dots P_{k-1})F_kq^{-(k+1)} = 0,$$

while that of the original matrix $B = HAH^{-1}$ is obtained by putting $q = 1$. Comparing these
two equations term by term it will be seen that the latent roots of $B_q$ are merely those of $B$
each divided by the factor $q$. Thus, in terms of the canonical population, the stable age
distribution appropriate to the dominant latent root $\lambda_1/q$ of the matrix $B_q$ may be taken as
a multiple of the vector

$$\psi_1 = \{(\lambda_1/q)^k,\ (\lambda_1/q)^{k-1},\ \dots,\ (\lambda_1/q),\ 1\},$$

and since $\xi_1 = H_q^{-1}\psi_1$,

$$\xi_1 = \{(P_0P_1\dots P_{k-1})^{-1}\lambda_1^k,\ (P_1P_2\dots P_{k-1})^{-1}\lambda_1^{k-1},\ \dots,\ (P_{k-1})^{-1}\lambda_1,\ 1\},$$

which is the same as the $\xi_1 = H^{-1}\psi_1$ appropriate to the root $\lambda_1$ of the original matrix $A$.
Moreover, since the time which it takes for an arbitrary $\xi_x$ to approach the stable form of
age distribution associated with the dominant root of the matrix will depend on the ratios
of this root to the other roots of the matrix, as may be seen from the expansion of $\xi_x(t)$ at
time $t$ in terms of the stable $\xi$,

$$\xi_x(t) = c_1\lambda_1^t\xi_1 + c_2\lambda_2^t\xi_2 + \dots + c_{k+1}\lambda_{k+1}^t\xi_{k+1},$$

it follows that a population with any arbitrary form of age distribution which is subject to
the matrix $q^{-1}A$ will approach the stable form at the same rate for all values of $q$. This
result is of interest in the theoretical study of wild mammalian populations, since it might
be assumed, at least as a first approximation, that any increase of mortality due to
predation, hunger, etc., falling on some optimum system of age-specific death-rates could be
represented by a factor which tended to be independent of age.
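
The invariance of the stable vector under division by $q$ can be checked numerically. The sketch
below is an illustration only. It uses the 4 × 4 numerical matrix of the introduction (first row
0, 6.4286, 18, 18; survival figures 0.7778, 0.7143, 0.6000) and assumes that the printed decimals
are roundings of 45/7, 7/9 and 5/7, so that exact fractions can be used. The function names are
hypothetical and are not taken from the paper.

```python
from fractions import Fraction as Fr

# Numerical matrix of the example, with the printed decimals read as exact
# fractions (an assumption): 6.4286 = 45/7, 0.7778 = 7/9, 0.7143 = 5/7.
A = [
    [Fr(0), Fr(45, 7), Fr(18), Fr(18)],
    [Fr(7, 9), Fr(0), Fr(0), Fr(0)],
    [Fr(0), Fr(5, 7), Fr(0), Fr(0)],
    [Fr(0), Fr(0), Fr(3, 5), Fr(0)],
]

STABLE = [Fr(81), Fr(21), Fr(5), Fr(1)]  # stable vector of A quoted in the text


def apply(matrix, vector):
    """Return the product matrix * vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def scaled(matrix, q):
    """Return q^{-1} * matrix, every element divided by the scalar q."""
    return [[a / q for a in row] for row in matrix]


def growth_factor(q):
    """Apply q^{-1} A to the stable vector of A.

    Returns the common ratio image/original if every element is multiplied
    by the same factor, and None if the shape of the vector has changed.
    """
    image = apply(scaled(A, q), STABLE)
    ratios = {y / x for x, y in zip(STABLE, image)}
    return ratios.pop() if len(ratios) == 1 else None


for q in (Fr(1), Fr(3, 2), Fr(3)):
    print("q =", q, " factor =", growth_factor(q))
```

For each value of $q$ tried, the image of {81, 21, 5, 1} is the same vector multiplied by $3/q$:
the dominant root is divided by $q$ while the shape of the stable distribution is untouched.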

If then we consider at time $t$ a population with an age distribution $\xi(t)$ which is subject to
the system of rates represented by the matrix $q^{-1}A$, and we regard $q$ as some function of $N$,
the number of individuals present in the population at time $t$, we might put as a first
approximation

$$q = \alpha + \beta N.$$

For the stationary state we must have $q = \lambda_1$, the dominant latent root of the matrix $A$,
and in addition, as $N$ tends to zero, $q$ must approach 1. When the dominant latent root of
$q^{-1}A$ is equal to unity, the condition for a stationary population,

$$N = \frac{\lambda_1 - 1}{\beta} = K,$$

and therefore we may write

$$q = 1 + \frac{(\lambda_1 - 1)N}{K}.$$

Then, assuming at time $t$ there are $N(t)$ individuals distributed as to age in the stable form of
distribution ($\xi_a$) for the matrix $q^{-1}A$, which distribution is the same for all values of $q$ as has
been shown above,

$$q^{-1}A\xi_a(t) = \frac{\lambda_1}{q}\xi_a(t) = \xi_a(t+1),$$

or

$$N(t+1) = \frac{\lambda_1N(t)}{1 + (\lambda_1 - 1)N(t)/K},$$

and

$$\frac{K - N(t+1)}{N(t+1)} = \lambda_1^{-1}\,\frac{K - N(t)}{N(t)},$$

which, as $\log_e\lambda_1 = r$, is the same thing as the logistic type of population growth,

$$N = \frac{K}{1 + Ce^{-rt}}.$$
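
The equivalence between the step-by-step relation and the logistic curve can be checked with a
few lines of arithmetic. The sketch below is illustrative only: it assumes the values
$\lambda_1 = 3$, $K = 10800$ and an initial number of 108 (those of the paper's numerical example),
takes $C = (K - N_0)/N_0$, and uses hypothetical function names.

```python
import math

LAMBDA1 = 3.0   # dominant latent root of A (value from the numerical example)
K = 10800.0     # upper limit of numbers
N0 = 108.0      # initial number, assumed to be in the stable form


def q_factor(n):
    """q = 1 + (lambda_1 - 1) N / K, the linear dependence on numbers."""
    return 1.0 + (LAMBDA1 - 1.0) * n / K


def recurrence(steps):
    """N(0), ..., N(steps) from N(t+1) = lambda_1 N(t) / q."""
    values = [N0]
    for _ in range(steps):
        n = values[-1]
        values.append(LAMBDA1 * n / q_factor(n))
    return values


def logistic(t):
    """N = K / (1 + C e^{-rt}) with r = log_e(lambda_1), C = (K - N0) / N0."""
    r = math.log(LAMBDA1)
    c = (K - N0) / N0
    return K / (1.0 + c * math.exp(-r * t))


for t, n in enumerate(recurrence(10)):
    print(t, round(n, 1), round(logistic(t), 1))
```

The two printed columns agree at every integral value of $t$, because $(K - N)/N$ is divided by
$\lambda_1 = e^r$ at each step, which is exactly what the logistic curve does over a unit interval.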


Thus, when fertility remains constant and mortality is affected by a factor which is in- 
dependent of age, this factor being regarded as a simple linear function of the numbers 
present in the population at time t, the total number of individuals in the population will 
increase according to the logistic form of population growth, provided that the age dis- 
tribution of the population at t = 0 is the stable form appropriate to the dominant latent 
root of the matrix A . But, when this condition is not fulfilled, and the initial age distribution 
is not of the stable form, there may be quite considerable departures from the curve given by 
this actual logistic equation. The form of the curves representing the total number of 
individuals at successive intervals of time will, however, still tend to be S-shaped, and in some 
cases there is little doubt that a logistic type of equation could be fitted empirically to the 
data over a considerable portion of the total curve. The type of variation which might be 
expected in these growth curves owing to a departure from the stable form of age distribution 
is illustrated in the following simple examples. 

Suppose that an entirely imaginary population, which can be considered in four age groups,
is subject to the optimum system of rates of death and reproduction represented by the
matrix defined originally in the introduction,

$$A = \begin{bmatrix} 0 & 6.4286 & 18 & 18 \\ 0.7778 & 0 & 0 & 0 \\ 0 & 0.7143 & 0 & 0 \\ 0 & 0 & 0.6000 & 0 \end{bmatrix},$$

which has a dominant root $\lambda_1 = 3$, or $r = 1.09861$, and suppose that for the matrix $q^{-1}A$,

$$q = 1 + 0.000185185N,$$

where $N$ is the number of individuals in the population at integral values of time $t$. When
$q = 3$, the stationary state, $N = 10800$; and at $t = 0$ let there be 108 individuals present in
the population. These conditions are fulfilled by the logistic equation

$$N = \frac{10800}{1 + 99e^{-1.09861t}}.$$

If at $t = 0$ we consider three different age distributions each consisting of 108 individuals
and represented by the vectors

$$\xi_a = \{81, 21, 5, 1\}, \qquad \xi_s = \{85, 17, 4, 2\}, \qquad \xi_x = \{0, 0, 108, 0\},$$

where $\xi_a$ is a stable age distribution of the matrix $A$, $\xi_s$ the vector of shortest length given
in the previous section and expressed to the nearest integer, and $\xi_x$ a very skew form of age
distribution in which all the individuals are concentrated in an age class for which fertility
is high, the age distributions and therefore the total number in each population can be
readily calculated by successive applications of the matrix $q^{-1}A$. The following are the results
obtained in each case, together with the values of $N$ calculated from the logistic equation


                          Values of N

    t      From          Initial age distribution
           logistic      $\xi_a$      $\xi_s$      $\xi_x$

    0        108.0         108          108          108
    1        317.6         318          292         1970
    2        900.0         901          844         1930
    3       2314.3        2316         2215         6199
    4       4860.1        4862         4660         8423
    5       7673.7        7675         7540         9389
    6       9508.7        9509         9433        10694
    7      10332.3       10332        10298        10609
    8      10639.5       10641        10628        10741
    9      10745.9       10745        10742        10804
   10      10781.9       10781        10780        10781


which are given in the first column. It will be seen that in the case of the stable age
distribution $\xi_a$ the values of $N$ follow those calculated from the logistic equation, apart from small
discrepancies at times in the last figure due to errors of rounding off. (The elements of the
vector $\xi(t+1) = q^{-1}A\xi(t)$ were in each case expressed to the nearest whole number.) In the
case of $\xi_s$, an age distribution which does not differ very greatly from the stable form, the
numbers lie below those for the initial distribution $\xi_a$ until $t = 10$, the stable age distribution
being approximately established in this population round about $t = 7$; while for $\xi_x$ the
numbers are very erratic owing to the very skew form of the initial distribution leading to a very
rapid increase in numbers during the early stages. The stable form of age distribution was
approximately established in this last population at $t = 10$. It is evident from these examples
that the initial form of age distribution may have a marked effect on the course of
development followed by a population which inherently is increasing towards some upper limit
according to the type of growth in numbers assumed here.
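
The successive applications of $q^{-1}A$ described above can be sketched as a short routine. This
is an illustration under stated assumptions, not a reconstruction of the original computation:
the printed decimals of the matrix are read as 45/7, 7/9 and 5/7, $q$ is taken as
$1 + (\lambda_1 - 1)N/K$ with $\lambda_1 = 3$ and $K = 10800$, and every element is rounded to the nearest
whole number after each step. The function name `project` and its `round_elements` switch are
hypothetical. Because the exact rounding convention of the printed table is not known, later
figures need not agree with it digit for digit.

```python
# Numerical matrix of the example; the fractions are an assumed reading of the
# printed decimals 6.4286, 0.7778 and 0.7143.
A = [
    [0.0, 45 / 7, 18.0, 18.0],
    [7 / 9, 0.0, 0.0, 0.0],
    [0.0, 5 / 7, 0.0, 0.0],
    [0.0, 0.0, 3 / 5, 0.0],
]
LAMBDA1 = 3.0
K = 10800.0


def project(xi, steps, round_elements=True):
    """Total numbers N(0), ..., N(steps) under xi(t+1) = q^{-1} A xi(t).

    q is recomputed at every step from the current total. With
    round_elements=True each element is rounded to the nearest whole number
    after the multiplication (an assumed convention).
    """
    xi = [float(x) for x in xi]
    totals = [sum(xi)]
    for _ in range(steps):
        q = 1.0 + (LAMBDA1 - 1.0) * sum(xi) / K
        xi = [sum(a * x for a, x in zip(row, xi)) / q for row in A]
        if round_elements:
            xi = [float(round(x)) for x in xi]
        totals.append(sum(xi))
    return totals


for name, start in (("stable", [81, 21, 5, 1]),
                    ("shortest", [85, 17, 4, 2]),
                    ("skew", [0, 0, 108, 0])):
    print(name, [int(n) for n in project(start, 10)])
```

Calling `project(start, 10, round_elements=False)` on the stable vector gives totals that follow
the logistic curve to within floating-point error, which isolates the effect of rounding from
the effect of the initial age distribution.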

The initial number of individuals in these three examples is small relative to the upper
limit of $K = 10800$, so that even a thoroughly skew form of distribution such as $\xi_x$ has time
in which to approach the stable form of age distribution before the upper limit in numbers
is achieved. Actually in the case of $\xi_x$ the stable form is not established before $t = 10$, and
a tendency to overshoot the upper limit will be noticed before that time. If the initial number
of individuals had been chosen much greater relative to $K$ this tendency would only have
been emphasized. An extreme case would have been to assume that the initial number of
individuals in each of the three examples was equal to 10800. Then it is evident that whereas
the population represented by $\xi_a$, the stable form, would have remained constant at the
same figure, those represented by $\xi_s$ and $\xi_x$ would vary on either side of the upper limit to
begin with and would tend to approach the steady state by a series of damped oscillations
as the stable age distribution was in the process of being established.


(b) Fertility affected by a factor independent of age, mortality remaining constant 
This problem raises a number of difficulties not all of which have been satisfactorily
resolved. But, before considering the main problem as defined here, namely when fertility
is affected by a factor independent of age, there is another case which arises from the
foregoing discussion, and which is perhaps worth mentioning. The canonical matrix

$$B_q = H_q(q^{-1}A)H_q^{-1}$$

defined above is, when written in full, to take a simple example of a 4 × 4 matrix,

$$B_q = \begin{bmatrix} F_0q^{-1} & P_0F_1q^{-2} & P_0P_1F_2q^{-3} & P_0P_1P_2F_3q^{-4} \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix},$$

and it can be seen that in addition to being the canonical form of $q^{-1}A$, $B_q$ is also the
canonical form of

$$A_q = \begin{bmatrix} F_0q^{-1} & F_1q^{-2} & F_2q^{-3} & F_3q^{-4} \\ P_0 & 0 & 0 & 0 \\ 0 & P_1 & 0 & 0 \\ 0 & 0 & P_2 & 0 \end{bmatrix},$$

the diagonal matrix $H$ of the transformation $HA_qH^{-1}$ having elements $h_{11} = P_0P_1P_2$,
$h_{22} = P_1P_2$, $h_{33} = P_2$, $h_{44} = 1$. This matrix $A_q$ can be regarded as representing some system
of age-specific rates in which an original level of fertility included in the $F_x$ figures has been
affected by a factor which increases geometrically with age, and as before this factor $q$ might
be taken as being linearly related to $N$, the number of individuals present in the population
at time $t$. But in contradistinction to the matrix $q^{-1}A$, the age distribution of a population
subject to the matrix $A_q$ will no longer remain stable. For suppose that in terms of the
canonical population the stable age distribution associated with the dominant root $\lambda_1/q$
of $A_q$ is

$$\psi_1 = \{(\lambda_1/q)^k,\ (\lambda_1/q)^{k-1},\ \dots,\ (\lambda_1/q),\ 1\},$$

the transformation $\xi_1 = H^{-1}\psi_1$ gives

$$\xi_1 = \{(P_0P_1\dots P_{k-1})^{-1}(\lambda_1/q)^k,\ \dots,\ 1\},$$

which is not the same as the $\xi_1$ associated with the dominant root $\lambda_1$ of the original matrix $A$.
If then at time $t$ a population happened to have the stable form of age distribution
appropriate to the matrix $A_q(t)$, it will not in general have the stable form of distribution at $t+1$
appropriate to $A_q(t+1)$, except in the case of the stationary population with $N = K$ and
$\lambda_1/q = 1$, when the life table age distribution is established.

By extending this argument for the matrix $A_q$ to the perfectly general case, it can be seen
that the age structure of a population will be constantly changing when the degree of
fertility is affected and the life table remains constant, until in the terminal stages of its growth
the population approaches the stationary state. This is, of course, essentially the same type
of changing age distribution as that shown to occur by Lotka (1931) in the case of a
population growing in numbers according to the logistic law with a constant form of life table.
A numerical example is given later of a population subject to the matrix $A_q$ when $q$ is taken
as a simple linear function of $N$.

Although, biologically speaking, it is not impossible for fertility to be affected by a factor
which increases geometrically with age and which depends on the number of individuals
present in the population at a given time, it is perhaps of greater interest to consider the
case in which the fractional decrease in fertility is the same at all ages. In other words, it is
necessary to consider the matrix $A_s$, say, in which the elements in the first row of a matrix
$A$ representing the optimum rates of death and reproduction are each divided by a factor $s$,

$$A_s = \begin{bmatrix} F_0s^{-1} & F_1s^{-1} & F_2s^{-1} & \dots & F_{k-1}s^{-1} & F_ks^{-1} \\ P_0 & 0 & 0 & \dots & 0 & 0 \\ 0 & P_1 & 0 & \dots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \dots & P_{k-1} & 0 \end{bmatrix}.$$

Now, if the $(x+1)$th element in the first row of the canonical form $B = HAH^{-1}$ is written as

$$(P_0P_1P_2\dots P_{x-1}F_x) = f_x,$$

the characteristic equation of the original matrix $A$ is

$$\lambda^{k+1} - \sum_{x=0}^{k} f_x\lambda^{k-x} = 0,$$

and that of the matrix $A_s$ is

$$\lambda^{k+1} - s^{-1}\sum_{x=0}^{k} f_x\lambda^{k-x} = 0.$$

If the real positive root of the first equation is $\lambda_1$, the real positive root of the second can
be written as $\lambda_1/q$, and the inherent rate of increase of a population subject to the system of
rates $A_s$ will be $r' = \log_e(\lambda_1/q)$. Since we are considering as before the case when $q$ is a function
of $N$, say

$$q = 1 + \frac{\lambda_1 - 1}{K}N,$$

it is necessary, in order to solve the problem of a population in which fertility is affected by
a factor independent of age, that $s$ should be expressed as a function of $q$.

This point proved to be rather troublesome, and the following solution needs a much fuller
investigation than it has received here. It depends on the relation between the first row of
the canonical form $B = HAH^{-1}$ and the $L_xm_x$ column which was touched on in the previous
paper (Leslie, 1945, § 6). It is evident that the division of the elements in the first row of the
matrix $A$ or $B$ by a scalar $s$ is the same thing as dividing the maternal frequency figures
($m_x$) by the same quantity. The original net reproduction rate, $R_0 = \int_0^\infty l_xm_x\,dx$, will therefore
become $R_0/s$. Now, in the solution of the equation

$$\int_0^\infty e^{-rx}l_xm_x\,dx = 1,$$

we have

$$\log_eR_0 = m_1r - \frac{m_2}{2!}r^2 + \frac{m_3}{3!}r^3 - \frac{m_4}{4!}r^4 + \dots, \tag{5.1}$$

where

$$m_1 = \frac{1}{R_0}\int_0^\infty x\,l_xm_x\,dx,$$

and $m_n$ ($n = 2, 3, 4, \dots$) is the $n$th moment about this mean. When the maternal frequency
is divided by $s$ the moments of the distribution will not be affected, but the value of $r$ will
change to a new value $r'$, and

$$\log_e(R_0/s) = m_1r' - \frac{m_2}{2!}r'^2 + \frac{m_3}{3!}r'^3 - \frac{m_4}{4!}r'^4 + \dots. \tag{5.2}$$


The moments are usually calculated by treating the $L_xm_x$ figures $\left(L_x = \int_x^{x+1} l_x\,dx\right)$ as a
frequency distribution, the individual frequencies being regarded as centered at the
midpoint of each age group. Alternatively they are sometimes calculated from $l_xm_x$, where $l_x$ is
the value of the usual life table function taken at each midpoint. When a system of rates is
expressed in the form of a matrix the elements of the first row of the canonical form
$B = HAH^{-1}$ are not the same as the $L_xm_x$ figures. But it was found (Leslie, 1945, § 6) that
the sum of these elements was equal to the net reproduction rate and that if each element
$(P_0P_1P_2\dots P_{x-1}F_x)$ was regarded as centered at the age of $x+1$, the mean and seminvariants
of the distribution were the same as those obtained from the $L_xm_x$ column.

These relationships suggested a possible way of relating $s$ to $q$. If for the matrix $A$, with
a dominant latent root $\lambda_1 = e^r$, the sum of the elements in the first row of $B = HAH^{-1}$ is
equal to $R_0$, and if the dominant latent root of the matrix $A_s$ is $\lambda_1/q = e^{r'}$, we might, as a
first approximation, take only the first terms in each of the equations (5.1) and (5.2), and put

$$\log_eR_0 = m_1r, \qquad \log_e(R_0/s) = m_1r',$$

and

$$\frac{r'}{r}\log_eR_0 = \log_e(R_0/s),$$

or, since

$$\frac{r'}{r} = 1 - \frac{\log_eq}{\log_e\lambda_1},$$

$$\log_es = \frac{\log_eR_0}{\log_e\lambda_1}\log_eq. \tag{5.3}$$


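For a greater degree of accuracy the first two terms could be taken in (5.1) and (5.2), viz.

$$\log_eR_0 = m_1r - \frac{m_2}{2}r^2, \qquad \log_e(R_0/s) = m_1r' - \frac{m_2}{2}r'^2,$$

and

$$\frac{m_1r' - \tfrac{1}{2}m_2r'^2}{m_1r - \tfrac{1}{2}m_2r^2}\,\log_eR_0 = \log_e(R_0/s).$$

From which, putting $\log_eq = w$ and $\dfrac{2m_1}{m_2} - r = c$,

$$\log_es = -\frac{\log_eR_0}{rc}\{(r - c)w - w^2\}. \tag{5.4}$$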

For a greater degree of accuracy still, further terms on the right-hand side of (5.1) and (5.2)
could be included, though the algebra tends to become somewhat tedious. Presumably the
number of terms which it would be necessary to include in any particular case would depend
on the magnitude of $r$ and upon the form of the distribution relating net fertility to age.
Actually in the elementary numerical example which has been used here so far, equation
(5.4) appears to be fairly accurate. Thus the characteristic equation of $A$ with $\lambda_1 = 3$ and
$R_0 = 21$ is

$$\lambda^4 - 5\lambda^2 - 10\lambda - 6 = 0.$$

Dividing these numerical coefficients by $s = 5$, for example,

$$\lambda^4 - \lambda^2 - 2\lambda - 1.2 = 0,$$

of which the real positive root is $\lambda_1/q = 1.63476$, or $q = 1.83513$. The values of $m_1$ and $m_2$
were 3.04762 and 0.52154 respectively, and equation (5.4) was in common logarithms

$$\log s = 2.48366\log q + 0.60263(\log q)^2. \tag{5.5}$$

For $q = 1.835$, the estimated value of $s$ is 4.975, whereas the true value is $s = 5$. If $s$ is
estimated from equation (5.3) for $q = 1.835$ the value is 5.379, so that the second degree
equation in $\log q$ is an improvement on the first and gives a reasonably close approximation
to $s$ for values lying in this region. It will be noted that if $q = 3$, $s = 21$ from this second
degree equation (5.5), as it should do.
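
The two approximations relating $s$ to $q$ can be compared numerically. The sketch below is an
illustration under stated assumptions: the first-row elements of $B$ for the example are taken as
5, 10 and 6 (the coefficients of the characteristic equation), centred at ages 2, 3 and 4; $m_2$
is computed as the variance about $m_1$; and the true root is found by a simple bisection. All
function names are hypothetical.

```python
import math

# First-row elements f_x of B for the numerical example, keyed by the age
# x + 1 at which each is regarded as centred (f_0 = 0 is omitted).
f = {2: 5.0, 3: 10.0, 4: 6.0}
R0 = sum(f.values())                                        # net reproduction rate
m1 = sum(age * w for age, w in f.items()) / R0              # mean
m2 = sum((age - m1) ** 2 * w for age, w in f.items()) / R0  # second moment about the mean
r = math.log(3.0)                                           # r = log_e(lambda_1)


def s_first_order(q):
    """Equation (5.3): log s = (log R0 / log lambda_1) log q."""
    return math.exp(math.log(R0) / r * math.log(q))


def s_second_order(q):
    """Equation (5.4): log s = -(log R0 / (r c)) {(r - c) w - w^2}."""
    w = math.log(q)
    c = 2.0 * m1 / m2 - r
    return math.exp(-math.log(R0) / (r * c) * ((r - c) * w - w * w))


def positive_root(s, lo=0.0, hi=10.0):
    """Bisection for the positive root of x^4 - (5 x^2 + 10 x + 6) / s = 0."""
    def g(x):
        return x ** 4 - (5.0 * x * x + 10.0 * x + 6.0) / s
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


root = positive_root(5.0)
q_true = 3.0 / root
print("m1, m2:", round(m1, 5), round(m2, 5))
print("root and q for s = 5:", round(root, 5), round(q_true, 5))
print("s from (5.3), (5.4):", round(s_first_order(q_true), 3), round(s_second_order(q_true), 3))
```

Both approximations return $s = 21$ exactly at $q = 3$, since each is constructed to agree with
$\log_eR_0$ when $r' = 0$; they differ only for intermediate values of $q$.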

In order to compare the operational effect of the matrix $A_s$ with that already determined
for $q^{-1}A$, two examples are given below for the initial age distributions

$$\xi_a = \{81, 21, 5, 1\}, \qquad \xi_x = \{0, 0, 108, 0\},$$

$\xi_a$ being the stable age distribution of 108 individuals for the matrix $A$, and $\xi_x$ the same form
of skew distribution used previously. As before, $q$ was taken as

$$q = 1 + 0.000185185N,$$

and the appropriate value of $s$ at each stage was calculated by means of equation (5.5). In
addition one example is given of the operation of the matrix $A_q$ in which fertility is affected
by a factor which increases geometrically with age, taking $\xi_a$ as the initial distribution. The
results were as follows:


                          Values of N

    t      From          Matrix $A_q$     Matrix $A_s$
           logistic      $\xi_a$          $\xi_a$      $\xi_x$

    0        108.0         108              108          108
    1        317.6         312              312         1915
    2        900.0         867              867         1976
    3       2314.3        2118             2116         5603
    4       4860.1        4194             4120         7315
    5       7673.7        6669             6393         8616
    6       9508.7        8867             8362        10464
    7      10332.3       10268             9696        10369
    8      10639.5       10876            10384        10695
    9      10745.9       10984            10673        10901
   10      10781.9       10900            10766        10715





Comparing the two cases in which the initial distribution was of the stable form $\xi_a$ with the
figures derived from the logistic curve, it will be seen that in both cases the numbers of
individuals are less than those for the logistic, particularly in the early stages of development.
Broadly speaking, however, all these three curves are similar in their general outlines, though
there is an obvious tendency in the case of the matrix $A_q$ for the population to overshoot the
upper limit of $N = 10800$ in the later stages. Similarly, in the case of the initial distribution
$\xi_x$ and the matrix $A_s$, the course of events is not very different from that for the previous
example with this distribution, when it was assumed that mortality was changing and
fertility remained constant, though, again here, the numbers of individuals are less when
fertility is changing and mortality remains the same. The chief difference between these
examples and those given previously lies, of course, in the forms of the age distribution.
When the matrix $q^{-1}A$ was assumed to be in operation, the ultimate age distribution to
which all populations would tend, whatever their initial conditions and numbers might be,
was

$$\xi = \{8100, 2100, 500, 100\};$$

whereas, both for the matrix $A_q$ and $A_s$, the stationary age distribution of 10800 individuals is

$$\xi = \{4050, 3150, 2250, 1350\};$$

and throughout the whole course of development of each population an approach is being
made to one or other of these very different distributions.
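
That these two vectors are indeed left unchanged by the respective matrices at the stationary
state can be verified directly. The sketch below is illustrative only: it assumes the printed
decimals of the numerical matrix stand for 45/7, 7/9 and 5/7, takes $q = 3$ and $s = 21$ as the
stationary values, and uses hypothetical helper names.

```python
from fractions import Fraction as Fr

F = [Fr(0), Fr(45, 7), Fr(18), Fr(18)]  # fertility figures F_0 .. F_3 (assumed exact)
P = [Fr(7, 9), Fr(5, 7), Fr(3, 5)]      # survival figures P_0 .. P_2 (assumed exact)


def leslie(first_row, survival):
    """Matrix with the given first row and the survival figures in the subdiagonal."""
    size = len(first_row)
    rows = [list(first_row)]
    for i, p in enumerate(survival):
        row = [Fr(0)] * size
        row[i] = p
        rows.append(row)
    return rows


def apply(matrix, vector):
    """Return the product matrix * vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


q, s = Fr(3), Fr(21)  # values at the stationary state of the example

A_over_q = leslie([f / q for f in F], [p / q for p in P])     # q^{-1} A: every element divided by q
A_q = leslie([f / q ** (x + 1) for x, f in enumerate(F)], P)  # fertility divided by q^(x+1)
A_s = leslie([f / s for f in F], P)                           # fertility divided by s at all ages

mortality_limit = [8100, 2100, 500, 100]
fertility_limit = [4050, 3150, 2250, 1350]

print(apply(A_over_q, mortality_limit) == mortality_limit)
print(apply(A_q, fertility_limit) == fertility_limit)
print(apply(A_s, fertility_limit) == fertility_limit)
```

The second vector is simply the life table column 1, 7/9, 5/9, 1/3 scaled to a total of 10800,
which is what a stationary population with unchanged mortality must show.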

Although the two extreme cases of either fertility or mortality changing through the
operation of a factor which is independent of age have been considered here separately,
there should be little difficulty in extending the methods so as to include the case where both
fertility and mortality are affected in varying degrees at the same time. Thus, we might
consider the scalar $q$ of the dominant latent root $\lambda_1/q$ at time $t$ as being the product, $q = uv$,
of two factors, one of which, $u$ say, represents an increase in mortality independent of age,
and the other $v$ represents the effect of a decrease in fertility at all ages by means of the
factor $s$. Various possibilities then arise, depending on whether the ratio $u/v$ was regarded
as a constant, or as varying in some predetermined manner. However, these questions have
not been gone into any further at present.

It will be noticed that the problem considered in this section of a growing population
subject to a changing degree of fertility and a constant life table is not precisely the same as
that discussed by Lotka (1931). In the first part of that paper Lotka showed how the
birth-rate, death-rate, age distribution and inherent rate of increase of such a population would
change when the total number of individuals in the population increased according to the
logistic law. Here no assumption is made as to the way in which the number of individuals
is increasing, but it is assumed that at equal intervals of time, which intervals in practice
can be made as small as we please according to the degree of accuracy required, the inherent
rate of increase of the population $r' = \log_e(\lambda_1/q)$ is dependent on the number of individuals
($N$) present at time $t$, and, as a first approximation, $q$ has been taken as a linear function of $N$.
The most important feature of this form of population growth is the marked effect which the
initial age distribution and numbers have on the subsequent course of development of the
population. Only in one case, namely when mortality is increased owing to the operation
of a factor independent of age, fertility remaining constant, and when the initial age
distribution is of the stable form appropriate to the matrix $A$, is the true logistic form of growth
in numbers realized. However, the result of operating on a not too abnormal initial
distribution with either of the matrices $q^{-1}A$, $A_q$ or $A_s$ is, broadly speaking, a very similar type
of S-shaped curve, if the initial numbers are small relative to the upper limit $K$, and in some
cases there is little doubt that a logistic equation could be fitted empirically to such a series
of points, more particularly when the figures for the total number of individuals are not
available over the complete range of development of the population. But, in general, we
shall have for a given matrix $A$ and a given value of $K$ in the equation $q = 1 + (\lambda_1 - 1)N/K$,
a family of S-shaped or partially S-shaped curves (or even the type of curve which descends
towards the upper limit $K$), the differences between the individual members depending on
the initial state of the population and on the way in which the decrease in the inherent rate
of increase takes place, whether through a decrease in fertility, or an increase in mortality,
or a combination in varying degrees of both factors. Among the more interesting features
of this type of population growth is the possibility, under suitable initial conditions,
of the total numbers in the population becoming greater than $K$ and then of finally
approaching the stationary state by means of a series of damped oscillations around
this limit.

It is interesting to consider in the light of these results some of the population growth curves 
which have been published for one or other species of insect living alone in a limited environ- 
ment (e.g. Chapman, 1928; Crombie, 1945). Certainly the initial age distribution of some of 
these populations must have been extremely skew, consisting as they did in many cases of 
only a small number, perhaps only a pair, of adults. It is a little difficult, on looking through 
the figures given in these various papers, to rid oneself of the impression that some of the 
curves may have been influenced, in part at least, by these rather extreme initial conditions. 
But at present this remains an impression and nothing more; it does suggest, however, that 
the part played by the initial age distribution is worth investigating further in these experi- 
mental populations. 

Although the dominant latent root of the matrix operating between t and t + 1 has been 
considered here only as a function of the number of individuals present at time t, there should 
be little difficulty in extending the argument so as to include the case when q is assumed to
be a function not of N(t) but of N(t − a), where a is an integer, or even of an integral, $\int_0^t N\,dt$
say. This last would be equivalent to assuming that the growth of the population was defined
by a type of integro-differential equation such as is introduced by Volterra in his development 
of population mathematics (e.g. Volterra, 1931, p. 141; Volterra & D’Ancona, 1935, p. 22). 
Moreover, there is another and more speculative approach which is not without interest. In 
all these various forms of population growth the inherent rate of increase is regarded as 
dependent on the total numbers and thus each individual is counted as being of the same 
value for all age distributions of which it is a member. In other words, the factor q is taken
to be some function of the scalar $[1]\xi$, where [1] is a row vector of units. Now, from the
biological point of view, it is not unreasonable to suppose that the form of the age distribution 
may also be of importance. For a given value of N we might have two entirely different age 
distributions, one of which was composed largely of adult individuals and only a small 
number of young, and the other with these proportions reversed. The question naturally 
arises whether one is justified in assuming that both the populations are of equal value and 
that they both influence the system of rates to the same extent. The one with the larger 
proportion of adults might exert a greater degree of influence on the rate of increase owing, 
for instance, to a proportionately greater consumption of food, or an enhanced mutual 
interference between the individual members of the population. But this is at present purely
speculative, and so far as the writer is aware, there is no experimental evidence for the
occurrence of such differential effects associated with the form of the age distribution when 
the populations are of the same size. As a possibility, however, it is of interest theoretically 
and it suggests that instead of counting all individuals as equal, some system of weighting the
individual age classes would be required. A mathematical model which immediately comes 
to mind is that of a matrix whose dominant latent root is affected by the length of the vector 
on which it is operating; that is to say, it would be assumed that the inherent rate of increase
was dependent on the present value of the population at a given time.


6. The predator-prey relationship between two populations

It is of interest to consider very briefly a simple type of predator-prey relationship between
two species of which the one, $S_1$, is preyed upon by the other, $S_2$. If the matrix $A_1$ with a
dominant latent root $\lambda_1$ represents the optimum system of rates for the prey and the matrix
$A_{1t}$ for this population at time t has a dominant root $\lambda_1/q_1$, we might regard the factor $q_1$ as
a function of $N_2$, the number of the predatory species $S_2$, and write as a first approximation,

$$q_1 = 1 + \alpha_1 N_2, \tag{6.1}$$

where $\alpha_1 > 0$ is a constant. In the same way there will be some optimum system of rates $A_2$
for the species $S_2$, though in fact this system may never be realized in full save under exceptional
circumstances, for instance when the prey are extremely numerous in comparison
with the predator, and everything in the environment is favourable to the latter species.
(From the biological point of view there must be some upper limit to the possible inherent
rate of increase of which a particular species is capable. For instance, in the case of mammals,
this limit will be determined in part by physiological factors, such as the length of the
gestation period, the shortest interval between litters, the maximum average number of
daughters per litter, the age at which breeding first starts, and so forth, as well as the form
of life table under the most favourable circumstances.) Then at time t the matrix $A_{2t}$
will have a dominant root $\lambda_2/q_2$ and we will write

$$q_2 = 1 + \alpha_2 \frac{N_2}{N_1}, \tag{6.2}$$

where $\alpha_2 > 0$ is another constant and $N_1$ the number of the species $S_1$ at time t. This equation
expresses in a simple fashion the main biological consequences to the species $S_2$ of its dependence
upon $S_1$ as a source of food. For when $N_1 \to 0$, $q_2 \to \infty$, and the inherent rate of increase
of the predator $r_2 = \log_e(\lambda_2/q_2) \to -\infty$ (disappearance of predator in the absence of any
prey). Conversely, when $N_1$ becomes very large, $q_2 \to 1$ and the inherent rate of increase of
the predator approaches its optimum value $r_2 = \log_e \lambda_2$.

Adopting, then, the simple system represented by (6.1) and (6.2) we shall have for the
stationary state, putting $q_1 = \lambda_1$ and $q_2 = \lambda_2$,

$$N_1 = \frac{\alpha_2(\lambda_1 - 1)}{\alpha_1(\lambda_2 - 1)} = K_1, \qquad N_2 = \frac{\lambda_1 - 1}{\alpha_1} = K_2,$$

which will be real positive quantities when both $\lambda_1$ and $\lambda_2 > 1$. Moreover, assuming for the
moment that a stable stationary state is possible, we must have $\alpha_2(\lambda_1 - 1) > \alpha_1(\lambda_2 - 1)$ and
$(\lambda_1 - 1) > \alpha_1$ for both species to coexist in appreciable numbers. Then, expressing $\alpha_1$ and $\alpha_2$
in terms of the $\lambda$'s and $K$'s,

$$q_1 = 1 + (\lambda_1 - 1)\frac{N_2}{K_2}, \tag{6.1a}$$

$$q_2 = 1 + (\lambda_2 - 1)\frac{K_1 N_2}{K_2 N_1}. \tag{6.2a}$$

This simple system, however, can be improved upon to some extent. It will be noticed that
if in equation (6.1) $N_2 = 0$, $q_1 = 1$ and thus in the absence of the predator it is assumed that
the prey will increase to an unlimited extent. In order to introduce the conception of a
limited environment, we might put

$$q_1 = 1 + \alpha_1 N_2 + \beta_1 N_1, \tag{6.3}$$

so that when $N_2 = 0$, the species $S_1$ will approach some upper limit in numbers. A slightly
more general system is represented then by equations (6.3) and (6.2), for which the stationary
state is

$$N_1 = \frac{\alpha_2(\lambda_1 - 1)}{\alpha_1(\lambda_2 - 1) + \alpha_2\beta_1}, \qquad N_2 = \frac{(\lambda_1 - 1)(\lambda_2 - 1)}{\alpha_1(\lambda_2 - 1) + \alpha_2\beta_1}.$$

It would thus be possible to examine the consequences of various hypotheses as to the 
way in which the reduction in the optimum inherent rates of increase for the two species are 
effected. The possible combinations are, however, so numerous that it is difficult to cover at 
all adequately any more than one of the most obvious cases. In order to illustrate the pro- 
perties of such a system, the simplest, and also the possibly not unrealistic example of the 
reduction in the rates for both species taking place through the operation of an additional 
force of mortality independent of age will be considered here. That is to say, it will be assumed 
that the effect of the species $S_2$ on the system of rates for the species $S_1$ will be to divide the
elements of the matrix $A_1$ by the factor $q_1$, and similarly that the effect of the species $S_1$ on the
species $S_2$ and the matrix $A_2$ will be to divide the elements of the latter by $q_2$. This simplifies
a number of the actual computations and also the analysis of the properties of the equations. 

If at time t the age distributions of the $N_1(t)$ and $N_2(t)$ individuals of the species $S_1$ and $S_2$
are of the stable forms appropriate to the dominant latent roots $\lambda_1$ and $\lambda_2$ of the matrices
$A_1$ and $A_2$ respectively, then from the properties of a matrix $q^{-1}A$ which were discussed in
the previous section, the two populations will retain their initial forms of age distribution
unchanged. The total numbers of individuals in the two populations, supposing these are
subject to the system defined by equations (6.1a) and (6.2a) respectively, will therefore be
at time t + 1

$$N_1(t+1) = \frac{\lambda_1 N_1(t)}{1 + (\lambda_1 - 1)\dfrac{N_2(t)}{K_2}}, \qquad N_2(t+1) = \frac{\lambda_2 N_2(t)}{1 + (\lambda_2 - 1)\dfrac{K_1 N_2(t)}{K_2 N_1(t)}},$$

whence

$$N_1(t+1) - N_1(t) = \frac{(\lambda_1 - 1)\,N_1(t)\left\{1 - \dfrac{N_2(t)}{K_2}\right\}}{1 + (\lambda_1 - 1)\dfrac{N_2(t)}{K_2}}$$

and

$$N_2(t+1) - N_2(t) = \frac{(\lambda_2 - 1)\,N_2(t)\left\{1 - \dfrac{K_1 N_2(t)}{K_2 N_1(t)}\right\}}{1 + (\lambda_2 - 1)\dfrac{K_1 N_2(t)}{K_2 N_1(t)}}.$$


Before discussing the limits to which these difference equations will tend when the time
interval is made smaller and smaller, it is necessary to consider the question of the value to
which the dominant latent root of the matrix will tend when the latter becomes of a very
large order. Suppose that working in some convenient unit of age and time we have the
matrix $A_1$, with a real positive root $\lambda_1$, representing some given system of age-specific
fertility and mortality rates. We can also construct a new matrix, $A_{1/2}$ say, for the same
system of rates when the time interval is taken to be a half-unit. This new matrix will be
twice the order of the original one and it will have a dominant root, $\lambda_{1/2}$ say, which will be
less than $\lambda_1$. Continuing the process further, we shall have for an interval of age and time h
a matrix $A_h$ with a dominant root $\lambda_h$, this root representing in the case of a population with
a stable age distribution, the ratio $N(t+h)/N(t)$. In order to compare the successive values
of $\lambda_h$ which would be obtained by making the interval h smaller and smaller, it is necessary
to express them in some common unit of time and we can write

$$\lambda = (\lambda_h)^{1/h} \quad \text{or} \quad \lambda_h = \lambda^h.$$

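Then, when the matrix remains constant in time, we shall have for a population with a
stable age distribution,

$$\frac{N(t+h) - N(t)}{h} = \frac{\lambda^h - 1}{h}\,N(t),$$

or, when $h \to 0$,

$$\frac{dN}{dt} = (\log_e \lambda)\,N,$$

since

$$\lim_{h \to 0} \frac{\lambda^h - 1}{h} = \log_e \lambda.$$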

Thus, as the matrix is made larger and larger, the value of log e A tends to p, the true instan- 
taneous relative rate of increase of the stable population per unit of time. 
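The passage to the limit can be illustrated numerically. In the following Python sketch the function name and the value 3 for the dominant root per unit of time are assumptions made for illustration; the quantity evaluated is the finite rate $(\lambda^h - 1)/h$ for successively smaller intervals h.

```python
import math


def finite_rate(lam, h):
    """Relative increase per unit of time over an interval h: (lam**h - 1) / h."""
    return (lam ** h - 1.0) / h


lam = 3.0  # illustrative dominant root per unit of time
for h in (1.0, 0.5, 0.1, 0.01, 0.001):
    print(h, finite_rate(lam, h))
print("log_e(lam) =", math.log(lam))
```

The printed values decrease steadily towards $\log_e \lambda$ as h is reduced.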

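In a similar fashion we may write for an interval h the above difference equations in the
form

$$\frac{N_1(t+h) - N_1(t)}{h} = \frac{\lambda_1^h - 1}{h}\cdot\frac{N_1(t)\left\{1 - \dfrac{N_2(t)}{K_2}\right\}}{1 + (\lambda_1^h - 1)\dfrac{N_2(t)}{K_2}},$$

$$\frac{N_2(t+h) - N_2(t)}{h} = \frac{\lambda_2^h - 1}{h}\cdot\frac{N_2(t)\left\{1 - \dfrac{K_1 N_2(t)}{K_2 N_1(t)}\right\}}{1 + (\lambda_2^h - 1)\dfrac{K_1 N_2(t)}{K_2 N_1(t)}},$$

which, as $h \to 0$, may be replaced by

$$\frac{dN_1}{dt} = (\log_e \lambda_1)\,N_1\left(1 - \frac{N_2}{K_2}\right), \qquad \frac{dN_2}{dt} = (\log_e \lambda_2)\,N_2\left(1 - \frac{K_1 N_2}{K_2 N_1}\right).$$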




Thus, when the age distributions of the populations $S_1$ and $S_2$ are each initially of the appropriate
stable form, and when it is assumed that their respective systems of rates are represented
by the matrices $q_1^{-1}A_1$ and $q_2^{-1}A_2$, the system of interrelations between the two
populations which is defined by

$$q_1 = 1 + \alpha_1 N_2, \qquad q_2 = 1 + \alpha_2 \frac{N_2}{N_1},$$

is equivalent to that defined by the differential equations

$$\frac{dN_1}{dt} = (r_1 - a_1 N_2)\,N_1, \qquad \frac{dN_2}{dt} = \left(r_2 - a_2 \frac{N_2}{N_1}\right)N_2, \tag{6.4}$$

or, when $q_1$ is defined by (6.3), to

$$\frac{dN_1}{dt} = (r_1 - a_1 N_2 - b_1 N_1)\,N_1, \qquad \frac{dN_2}{dt} = \left(r_2 - a_2 \frac{N_2}{N_1}\right)N_2, \tag{6.5}$$

where in both sets $r_1 = \log_e \lambda_1$, $r_2 = \log_e \lambda_2$ and $a_1$, $a_2$, $b_1$ are constants > 0. This result is
analogous to that discussed in the previous section for a single population increasing in a 
limited environment, where it was shown that when mortality was affected by a factor 
independent of age and the initial distribution was of the appropriate stable form, the 
numbers of individuals increased according to the logistic law, and that consequently under 
these conditions the type of population growth resulting from the operation of the matrix 

$q^{-1}A$, where

$$q = 1 + \frac{\lambda - 1}{K}\,N,$$

was equivalent to that defined by the differential equation,

$$\frac{dN}{dt} = rN\left(1 - \frac{N}{K}\right).$$

The system of equations (6-4) differs somewhat from the classical Lotka-Volterra equations 
(Lotka, 1925, Chap. 8; Volterra, 1931, p. 14) for a simple predator-prey relationship between 
two species, in which the second member would be written 

$$\frac{dN_2}{dt} = (-r_2 + a_2 N_1)\,N_2.$$

The form of the second member in (6-4) was originally suggested by the results of an analysis 
made by the author (unpublished observations) of some data given by Gause (1934) for the 
growth in numbers of Paramecium caudatum and Paramecium aurelia cultures, in which the 
food supply consisted of a suspension of Bacillus pyocyaneus in a buffered medium. Two 
different concentrations of bacteria, called by Gause ‘one loop’ and ‘half-loop’, were
used for both species of Paramecium, and under the conditions of the experiments these
populations could be regarded as living in a limited environment with a constant supply of 
food. It was apparent from the results that for each species living alone the upper limit to 
the number of individuals depended on the concentration of food, being in each case approximately
twice as great in the cultures with the ‘one loop’ concentration as in those with the
‘half-loop’. If logistic equations are fitted to the four series of data given by Gause (1934,
table 4, p. 145), it will be found that whereas the constant r in the equation dN/dt = (r − aN)N
remains approximately the same in the pair of experiments on each species of Paramecium,
the constant a is inversely proportional to the concentration of food (see also on this point 
Kostitzin, 1937, p. 77). Thus, when the food supply (F) was kept constant, the form of 
population growth in numbers could be written 



where C represents the relative concentration of food in the different experiments. This 
relationship suggested a system of equations such as (6.4) for the theoretical case of a food
supply consisting of a population of individuals which when living alone would increase at
a rate $dN_1/dt = r_1 N_1$. However, apart from these considerations, the form of the second
member of (6.4) is linked with the type of expression used here to define $q_2$ in terms of $N_1$
and $N_2$, and the latter arose as one of the simplest and most obvious ways of expressing the
dependence of the species $S_2$ on $S_1$, bearing in mind that the elements of the matrix representing
the system of rates at a given time must be positive quantities ($F_x \ge 0$, $0 < P_x \le 1$).
The difficulties which arise when this is not the case will be appreciated on endeavouring
to find a working model in terms of matrices and vectors which will reduce to the classic
Lotka-Volterra equations under suitable initial conditions. For, in the case of the predatory
species $S_2$ we should have to consider a reciprocal matrix $A_2^{-1}$ with a real positive root $\lambda_2^{-1}$,
and at time t the matrix $q_2 A_2^{-1}$ would be regarded as operating on the vector $\xi(t)$ representing
the age distribution of $S_2$. Then, if as before the matrix $q_1^{-1}A_1$ represents the system of rates
for the species $S_1$ and

$$q_2 = 1 + \alpha_2 N_1,$$

we have a system which will reduce to the Lotka-Volterra differential equations when the
initial age distributions of both populations are of the stable form appropriate to their
respective matrices $A_1$ and $A_2$. Now, apart from the fact that here no upper limit is placed
on the inherent rate of increase, $r_2 = \log_e(q_2/\lambda_2)$, of the species $S_2$, there is an added complication
that a number of the elements of $A_2^{-1}$ will be negative (for the form of the matrix $A^{-1}$
see the previous paper, § 4). Although no difficulties arise in the special case, when the age
distribution of $S_2$ is of the stable form, in the perfectly general case of an arbitrary $\xi(t)$ some
of the elements of $\xi(t+1) = q_2 A_2^{-1}\xi(t)$ can become negative and thus meaningless from the
biological point of view. For these various reasons, therefore, the form of interrelationship
between the two species defined by equations (6.1a) and (6.2a) was adopted here as a working
model, and these reduce in the special case to the system of differential equations (6.4).

The writer has to confess that he has been unable to integrate either of the sets (6.4) and
(6.5). Their main properties, however, seem to be quite clear. Taking the simplest system
(6.4) first, we have for $dN_1/dt = dN_2/dt = 0$,

$$N_1 = \frac{r_1 a_2}{r_2 a_1} = K_1, \qquad N_2 = \frac{r_1}{a_1} = K_2,$$

and, introducing for simplicity the variables $n_1 = N_1/K_1$, $n_2 = N_2/K_2$,

$$\frac{dn_1}{dt} = r_1 n_1 (1 - n_2), \qquad \frac{dn_2}{dt} = r_2 n_2\left(1 - \frac{n_2}{n_1}\right).$$

We will suppose that we are dealing with the case when $r_1 a_2 > r_2 a_1$, $r_1 > a_1$, in order that
the stationary state may have a real meaning from the biological point of view. Then, in
considering small departures from the stationary state, let $v_1 = n_1 - 1$, $v_2 = n_2 - 1$; and,
disregarding in the usual way terms such as $v_1 v_2$, $v_2^2$, etc.,

$$\frac{dv_1}{dt} = -r_1 v_2, \qquad \frac{dv_2}{dt} = r_2 v_1 - r_2 v_2.$$


This linear system will have a solution of the type $v_1 = A_1 e^{\mu_1 t} + B_1 e^{\mu_2 t}$, $v_2 = A_2 e^{\mu_1 t} + B_2 e^{\mu_2 t}$,
where the values of $\mu$ will be given by the roots of the characteristic determinant

$$\begin{vmatrix} -\mu & -r_1 \\ r_2 & -(r_2 + \mu) \end{vmatrix} = 0,$$

or

$$2\mu = -r_2 \pm \sqrt{(r_2^2 - 4 r_1 r_2)}.$$

Thus, both roots $\mu_1$ and $\mu_2$ will be complex so long as $r_2 < 4r_1$, and the real part of this pair
will be negative since $r_2 > 0$. The system under these conditions will therefore approach the
stationary state by a series of damped oscillations. When $r_2 > 4r_1$, both $\mu_1$ and $\mu_2$ will be
negative; and consequently the stationary state will be stable, since in both cases $v_1$ and $v_2$
tend to zero as time increases.
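The two cases can be seen directly by evaluating the roots. The Python sketch below is illustrative: the function name is hypothetical, the first call uses $r_1 = r_2 = \log_e 3$, and the second uses arbitrary values chosen only so that $r_2 > 4r_1$.

```python
import cmath
import math


def roots_first_system(r1, r2):
    """Roots of mu**2 + r2*mu + r1*r2 = 0,
    i.e. 2*mu = -r2 +/- sqrt(r2**2 - 4*r1*r2)."""
    disc = cmath.sqrt(r2 * r2 - 4.0 * r1 * r2)
    return (-r2 + disc) / 2.0, (-r2 - disc) / 2.0


r = math.log(3.0)
mu1, mu2 = roots_first_system(r, r)      # r2 < 4*r1: complex pair, damped oscillation
nu1, nu2 = roots_first_system(1.0, 5.0)  # r2 > 4*r1: hypothetical values, two real roots

print(mu1, mu2)
print(nu1, nu2)
```

In the first case the real part is $-\tfrac12 r_2$ and the imaginary part is $\pm\tfrac12 r\sqrt{3}$; in the second both roots are real and negative.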

The analysis of the system represented by equations (6.5) leads to very similar results.
For $dN_1/dt = dN_2/dt = 0$,

$$N_1 = \frac{r_1 a_2}{r_2 a_1 + a_2 b_1} = K_1, \qquad N_2 = \frac{r_1 r_2}{r_2 a_1 + a_2 b_1} = K_2.$$

And, in the same way as before, putting $n_1 = N_1/K_1$, $n_2 = N_2/K_2$, $v_1 = n_1 - 1$, $v_2 = n_2 - 1$,
and neglecting terms in $v_1 v_2$, etc., we have

$$\frac{dv_1}{dt} = -r_1(1 - k)\,v_1 - r_1 k\,v_2, \qquad \frac{dv_2}{dt} = r_2 v_1 - r_2 v_2,$$

where

$$k = \left(1 + \frac{a_2 b_1}{r_2 a_1}\right)^{-1} \qquad (0 < k < 1).$$

Then, putting the characteristic determinant

$$\begin{vmatrix} -\{r_1(1 - k) + \mu\} & -r_1 k \\ r_2 & -(r_2 + \mu) \end{vmatrix} = 0,$$

we have $\mu^2 + \{r_2 + r_1(1 - k)\}\mu + r_1 r_2 = 0$.

The roots of this equation will be either both negative or both complex with the real
part negative, depending on the relation between the various constants, and consequently
both $v_1$ and $v_2$ will tend to zero as time goes on, the stationary state thus being stable as
before. It will be noticed, however, that if $\mu = u \pm iv$, the damping term represented by the
real part, $u = -\tfrac12\{r_2 + r_1(1 - k)\}$, will be greater than in the case of the first system of equations
(6.4) where $u = -\tfrac12 r_2$. Again, for a given set of values of $r_1$, $r_2$, $a_1$, $a_2$, the number of individuals
$N_1 = K_1$, $N_2 = K_2$ must be less for the second set of equations than for the first, since by
definition $b_1 > 0$. Thus we might expect that for a population subject to equations (6.5) the
stationary numbers will be lower and the approach to the stationary state more rapid than
for a population subject to equations (6.4), provided that the values of $r_1$, $r_2$, $a_1$ and $a_2$ are
the same in both cases.
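The comparison of the damping terms may be sketched numerically. In the Python fragment below the function names and the constants $a_1$, $a_2$, $b_1$ are hypothetical, chosen only to give a definite case with $r_1 = r_2 = \log_e 3$; they are not taken from any fitted data.

```python
import cmath
import math


def damping_constant(r2, a1, a2, b1):
    """k = 1 / (1 + a2*b1 / (r2*a1)); lies between 0 and 1 when b1 > 0."""
    return 1.0 / (1.0 + a2 * b1 / (r2 * a1))


def roots_second_system(r1, r2, k):
    """Roots of mu**2 + {r2 + r1*(1 - k)}*mu + r1*r2 = 0."""
    b = r2 + r1 * (1.0 - k)
    disc = cmath.sqrt(b * b - 4.0 * r1 * r2)
    return (-b + disc) / 2.0, (-b - disc) / 2.0


r = math.log(3.0)
a1, a2, b1 = r / 1000.0, 5.0 * r, r / 10800.0  # hypothetical constants
k = damping_constant(r, a1, a2, b1)
mu1, mu2 = roots_second_system(r, r, k)

print("k =", k)
print("real part, second system:", mu1.real)
print("real part, first system: ", -r / 2.0)
```

With these assumed values the roots are complex and the real part is more strongly negative than $-\tfrac12 r_2$, in agreement with the argument above.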

As a numerical example of these predator-prey equations, suppose that the optimum
system of rates for two imaginary species were the same and that they were represented by
the matrix A which has been used previously to illustrate various points. Then $A_1 = A_2$,
$\lambda_1 = \lambda_2 = 3$ and $r_1 = r_2 = 1.09861$. If, for the first set of equations (6.1) and (6.2) we put

$$q_1 = 1 + 0.002\,N_2, \tag{6.6}$$

$$q_2 = 1 + 10\,N_2/N_1, \tag{6.7}$$

and for the second, (6.3) and (6.2),

$$q_1 = 1 + 0.002\,N_2 + 0.000185185\,N_1, \tag{6.8}$$

$q_2$ remaining as before, the number of individuals for the stationary state are in the first case
$K_1 = 5000$, $K_2 = 1000$, and in the second $K_1 = 3418$, $K_2 = 684$. Then, assuming that at
t = 0, $N_1 = N_2 = 108$, and that each of these populations had the same stable form of age
distribution {81, 21, 5, 1},
the results of operating on these two age distributions with the matrices $q_1^{-1}A_1$ and $q_2^{-1}A_2$
were as follows for the two sets of equations. The first two columns give the numbers of
prey ($N_1$) and predators ($N_2$) when no upper limit is placed on the number of prey (equations
(6.6) and (6.7)), and the second two columns give the respective numbers when the upper
limit to $N_1$ would be 10800 individuals, if the predatory species was absent (equations (6.8)
and (6.7)). (This is the same logistic population as was used in the previous section as an
illustration.)


              I                  II
  t      N_1      N_2       N_1      N_2

  0      108      108       108      108
  1      266       30       262       30
  2      755       42       710       42
  3     2089       81      1754       79
  4     5393      175      3550      163
  5    11983      396      5371      335
  6    20056      894      6046      619
  7    21583     1854      5403      917
  8    13756     2991      4227     1020
  9     5910     2826      3317      897
 10     2665     1466      2921      726
 11     2033      677      2928      625
 12     2592      469      3146      598
 13     4012      501      3396      618
 14     6011      668      3555      658
 15     7717      949      3587      692
 16     7986     1277      3529      709
 17     6741     1474
 18     5122     1388
 19     4071     1122
 20     3763      896
 21     4043      795
 22     4684      804




In both cases the approach to the stationary state by means of a series of damped oscilla- 
tions is very evident, this approach being made more rapidly in the second series than in the 
first as was to be expected from the results of the foregoing analysis. Probably the clearest 
graphical illustration of these functions is obtained by plotting log $N_2$ against log $N_1$, the
result being a spiral curve which gradually approaches the stationary point. 
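The course of the two series can be retraced with a few lines of Python. The sketch below is not the original computation, which operated on the age distributions with the matrices themselves; it assumes that both populations keep their stable age distributions, so that only the total numbers need be followed. The function names are hypothetical, and the constants are those of equations (6.6) to (6.8).

```python
def step(n1, n2, lam1=3.0, lam2=3.0, alpha1=0.002, alpha2=10.0, beta1=0.0):
    """One unit of time for the totals when both age distributions are stable:
    each total is multiplied by (dominant root) / q."""
    q1 = 1.0 + alpha1 * n2 + beta1 * n1
    q2 = 1.0 + alpha2 * n2 / n1
    return lam1 * n1 / q1, lam2 * n2 / q2


def simulate(steps, beta1=0.0, n1=108.0, n2=108.0):
    """Return the list of (N1, N2) for t = 0, 1, ..., steps."""
    series = [(n1, n2)]
    for _ in range(steps):
        n1, n2 = step(n1, n2, beta1=beta1)
        series.append((n1, n2))
    return series


series_i = simulate(22)                          # no upper limit on the prey
series_ii = simulate(16, beta1=2.0 / 10800.0)    # prey limited to 10800 alone

for t, (prey, predators) in enumerate(series_i):
    print(t, round(prey), round(predators))
```

The values so obtained agree closely, though not to the last figure, with those tabulated; small differences are to be expected from rounding in a computation carried out by hand on the age classes. Continuing either series for a few hundred steps brings the numbers to the stationary values.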

Although these predator-prey equations have been studied here only in the special case 
of the reduction in the rates of increase of the two populations being effected by an increase 
in the degree of mortality which is independent of age, there would be little difficulty in 
investigating, for instance, the type of case in which a relative absence of prey affected the 
fertility of the predator, and so forth. Moreover, there will be in all cases the effect on such 
a system of any abnormalities in the initial age distributions, or of any chance disturbances 
of the existing age distributions at some point in the development of the populations. 
Without working out any actual examples, however, it might be expected from the results 
obtained in the case of a logistic-type population that the general effect of all these factors 
would be to add further oscillatory features to those which already are inherent in the system 
itself, even when the stability of the age-distributions is established as in the above numerical
examples. It seems likely, too, that these additional factors will increase the chance of one 
or other of the two species being reduced to such low numbers as would be equivalent in 
practice to the extinction of the population. This possibility will, however, greatly depend 
on the numerical relations between the various constants which enter into the equations
and upon the initial conditions of the particular system. Finally, just as in the case of a
solitary population increasing in a limited environment, there is the possibility of studying
the more complicated cases in which $q_1$ and $q_2$ are taken to be functions not only of $N_1$ and
$N_2$ at time t, but of the numbers at some previous time, or of an integral of $N_1$ or $N_2$ between
some time limits. Similar methods could also be used in order to study a chain of such predator-prey
relations.

This work arose out of some research carried out by the Bureau of Animal Population 
with the aid of a grant from the Agricultural Research Council, to which body grateful 
acknowledgement is made. 


REFERENCES 

Chapman, R. N. (1928). The quantitative analysis of environmental factors. Ecology, 9, 111-22.
Crombie, A. C. (1945). On competition between different species of graminivorous insects. Proc. Roy.
Soc. B, 132, 362-95.
Fisher, R. A. (1930). The Genetical Theory of Natural Selection. Oxford: Clarendon Press.
Gause, G. F. (1934). The Struggle for Existence. Baltimore: Williams and Wilkins.
Kostitzin, V. A. (1937). Biologie mathématique. Paris: Armand Colin.
Leslie, P. H. (1945). On the use of matrices in certain population mathematics. Biometrika, 33,
183-212.
Lewis, E. G. (1942). On the generation and growth of a population. Sankhyā, 6, 93-6.
Lotka, A. J. (1925). Elements of Physical Biology. Baltimore: Williams and Wilkins.
Lotka, A. J. (1931). The structure of a growing population. Hum. Biol. 3, 459-93.
Lotka, A. J. (1939). Théorie analytique des associations biologiques. II. Analyse démographique avec
application particulière à l'espèce humaine. Actualités Sci. no. 780, 1-149. Paris: Hermann.
Volterra, V. (1931). Leçons sur la théorie mathématique de la lutte pour la vie. Paris: Gauthier-Villars.
Volterra, V. & D'Ancona, U. (1935). Les associations biologiques au point de vue mathématique.
Actualités Sci. no. 243, 1-96. Paris: Hermann.




THE TRANSFORMATION OF POISSON, BINOMIAL AND 
NEGATIVE-BINOMIAL DATA 

By F. J. ANSCOMBE, Rothamsted Experimental Station 

1. Introduction 

Bartlett (1936) showed that if r is a Poisson variable with mean m and y is a random variable
whose values y are derived by the transformation

$$y = \sqrt{r} \tag{1.1}$$

from the values r of r, then y is distributed rather more nearly normally than r with variance
approximately $\tfrac14$ if m is large, and the technique of analysis of variance may be applied to y.*
He also showed that

$$y = \sqrt{(r + \tfrac12)} \tag{1.2}$$

is a better transformation, if slightly less convenient to use, as y then has more nearly a
constant variance of $\tfrac14$, even when m is not large. Similar transformations were proposed
for a binomial variable, and Beall (1942) gave the transformation analogous to (1.1) appropriate
to a negative binomial variable.

I begin by considering the transformation

$$y = \sqrt{(r + c)} \tag{1.3}$$

of a Poisson variable r, and show that for large m y has a most nearly constant variance
(namely, $\tfrac14$) when $c = \tfrac38$; a result due to A. H. L. Johnson.

The similar transformation for a binomial variable r, with mean m and total number n, is

$$y = \sin^{-1}\sqrt{\frac{r + c}{n + 2c}}. \tag{1.4}$$

The optimum value of c is $\tfrac38$ if m and n − m are large. The variance is approximately $\tfrac14(n + \tfrac12)^{-1}$.
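The stated variance may be compared with an exact evaluation. The Python sketch below sums over the binomial distribution directly; the function name and the case n = 200, p = 0.4 are assumptions made for illustration, and the angle is measured in radians.

```python
import math


def arcsine_variance(n, p, c):
    """Exact variance of y = asin(sqrt((r + c) / (n + 2c))) for r ~ Binomial(n, p)."""
    mean = 0.0
    second = 0.0
    for r in range(n + 1):
        prob = math.comb(n, r) * p ** r * (1.0 - p) ** (n - r)
        y = math.asin(math.sqrt((r + c) / (n + 2.0 * c)))
        mean += prob * y
        second += prob * y * y
    return second - mean * mean


n, p = 200, 0.4  # hypothetical case: m = 80, n - m = 120
for c in (0.0, 0.375, 0.5):
    # Ratio of the exact variance to 1/(4n + 2); a value near 1 supports the approximation.
    print(c, arcsine_variance(n, p, c) * (4 * n + 2))
```

For the assumed case the ratio should lie closest to unity when c is three-eighths.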

For a negative binomial variable r, with mean m and exponent k, the latter being constant
and known, the corresponding transformation is

$$y = \sinh^{-1}\sqrt{\frac{r + c}{k - 2c}}. \tag{1.5}$$

The optimum value of c is roughly $\tfrac38$ if m is large and k > 2, and the variance is approximately
$\tfrac14\psi'(k)$, where $\psi'(t)$ denotes the second derivative of $\ln\Gamma(t)$ with respect to t. A simpler
transformation, known to have an optimum property (i.e. to be the best of that degree of
complexity) for m large and k > 1, is

$$y = \ln(r + \tfrac12 k); \tag{1.6}$$

the variance is approximately $\psi'(k)$. This is equivalent to setting $c = \tfrac14 k$ in (1.5). If k is large,
$\psi'(k) = 1/(k - \tfrac12)$ approximately.
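The closeness of the last approximation can be judged from a short computation. The Python sketch below evaluates the second derivative of $\ln\Gamma$ by the recurrence $\psi'(x) = \psi'(x+1) + 1/x^2$ followed by the usual asymptotic series; the function name and the choice of shift are assumptions of this illustration.

```python
import math


def trigamma(x, shift=10.0):
    """Second derivative of ln Gamma(x): upward recurrence, then an asymptotic series."""
    total = 0.0
    while x < shift:
        total += 1.0 / (x * x)   # psi'(x) = psi'(x + 1) + 1/x**2
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # psi'(x) ~ 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    total += inv + 0.5 * inv2 + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0))
    return total


for k in (2.0, 5.0, 10.0, 50.0):
    print(k, trigamma(k), 1.0 / (k - 0.5))
```

The two columns draw together as k increases, the simple expression being somewhat too large for small k.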

The effect of these transformations for small values of m is shown numerically. The value
$\tfrac38$ for c appears to be nearly optimum for practical purposes, except with the negative binomial
distribution when k is small. (When k = 2, the optimum value of c appears to be
about 0.2.) In any case, it may be more convenient to choose a one-decimal value for c,
i.e. generally 0.3 or 0.4.
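For the Poisson case the effect of the choice of c can be seen by direct summation. The Python sketch below is an illustration under assumed values of m; the function name is hypothetical, and the series is truncated far enough into the upper tail for the neglected probability to be negligible.

```python
import math


def sqrt_variance(m, c):
    """Exact variance of y = sqrt(r + c) for r ~ Poisson(m), by direct summation."""
    r_max = int(m + 20.0 * math.sqrt(m) + 50.0)  # generous truncation point
    prob = math.exp(-m)                           # P(r = 0)
    mean = prob * math.sqrt(c)
    for r in range(1, r_max + 1):
        prob *= m / r                             # P(r) from P(r - 1)
        mean += prob * math.sqrt(r + c)
    return (m + c) - mean * mean                  # E(r + c) = m + c exactly


for m in (2.0, 5.0, 20.0):
    print(m, [round(sqrt_variance(m, c), 4) for c in (0.0, 0.375, 0.5)])
```

Each row gives the variance for c = 0, 0.375 and 0.5, to be compared with the target value of one quarter.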


* Letters in heavy type denote random variables, of which the same letter in light type denotes
a possible value. Some equations, in particular all those of § 1, are equally valid for y and r in either type.



247 


,-^he ‘angular transformation’ given by 
*ie observed percentage as 100(r + c)/(n + 2c), 
it'e variance. For transformation (1*5), no corre- 
^ublished, I believe*. 

2. Poisson distribution


We suppose that r is distributed in a Poisson distribution with mean m, and we consider the
transformation (1.3), with c a non-negative constant.

Let r − m = t, m + c = m'. Coefficients $a_s$ are defined for s = 1, 2, 3, ... by

$$a_s = (-1)^{s+1}\,\frac{1\cdot(-1)\cdot(-3)\cdots(-2s+3)}{2^s\, s!}. \tag{2.1}$$


Then for any t > −m' we have the Taylor series expansion

$$y = \sqrt{m'}\left\{1 + a_1\frac{t}{m'} - a_2\left(\frac{t}{m'}\right)^2 + \cdots + (-1)^s a_{s-1}\left(\frac{t}{m'}\right)^{s-1}\right\} + R_s. \tag{2.2}$$

If t > 0, we see at once from Lagrange’s form of the remainder term that $R_s$ satisfies

(2.3)

Considering now |t| ≤ m', we have directly from (2.2)

(2.4)

The series converges, and we may write

(2.5)

where the right-hand side again converges and is bounded. If 0(s) is a bound to its absolute


Comparing this inequality with (2.3), we see that it holds for all t > −m'.
We note now that the moments of t are


$$\mu_1 = 0, \quad \mu_2 = m, \quad \mu_3 = m, \quad \mu_4 = 3m^2 + m, \quad \text{etc.}, \tag{2.7}$$

and the absolute moment of order n is $O(m^{n/2})$ as $m \to \infty$. We may therefore take expectations
formally in the right-hand side of (2.2) and its powers, and derive asymptotic expansions
for the moments of y as $m \to \infty$. We find



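$$\operatorname{Var}(y) \sim \frac14\left\{1 + \frac{3 - 8c}{8m} + \frac{32c^2 - 52c + 17}{32m^2}\right\}, \tag{2.8}$$

so that when $c = \tfrac38$

$$\operatorname{Var}(y) \sim \frac14\left\{1 + \frac{1}{16m^2}\right\}. \tag{2.9}$$

We have also

$$E(y) \sim \sqrt{(m + c)} - \frac{1}{8m^{1/2}} + \frac{24c - 7}{128m^{3/2}}. \tag{2.10}$$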
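The expansions (2.8) and (2.10) can be set beside the exact moments obtained by summation. The Python sketch below is illustrative only; the function names and the value m = 30 are assumptions.

```python
import math


def exact_moments(m, c):
    """Exact mean and variance of sqrt(r + c) for r ~ Poisson(m)."""
    r_max = int(m + 20.0 * math.sqrt(m) + 50.0)
    prob = math.exp(-m)
    mean = prob * math.sqrt(c)
    for r in range(1, r_max + 1):
        prob *= m / r
        mean += prob * math.sqrt(r + c)
    return mean, (m + c) - mean * mean


def series_moments(m, c):
    """Mean and variance from the asymptotic expansions (2.10) and (2.8)."""
    mean = (math.sqrt(m + c) - 1.0 / (8.0 * math.sqrt(m))
            + (24.0 * c - 7.0) / (128.0 * m ** 1.5))
    var = 0.25 * (1.0 + (3.0 - 8.0 * c) / (8.0 * m)
                  + (32.0 * c * c - 52.0 * c + 17.0) / (32.0 * m * m))
    return mean, var


for c in (0.0, 0.375):
    print(c, exact_moments(30.0, c), series_moments(30.0, c))
```

For a mean of this size the two pairs of values should differ only in the later decimal places.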

If we set

$$E(y) = \sqrt{(m_y + c)}, \tag{2.11}$$


* [Beall (1942, pp. 250-51) gave a table of x 1 = sinh^kc)*, suited for the form of trans-
formation which he used. Ed.] 



m v is the estimate of m derived by appl, 
metio mean y of a large sample of observer 




$$m_y \sim m - \frac14 + \frac{8c - 3}{32m}, \tag{2.12}$$


so that setting c = f also renders the bias m„ — m in m v 
Skewness and kurtosis of the distribution of y are meai: 


$$\gamma_1 \sim -\frac{1}{2m^{1/2}}\left\{1 + \frac{25 - 48c}{16m}\right\}, \tag{2.13}$$

$$\gamma_2 \sim \frac{1}{m}\left\{1 + \frac{945 - 1536c}{256m}\right\}. \tag{2.14}$$

These compare with $m^{-1/2}$ and $m^{-1}$ for the original Poisson variable r.

It is also of interest to find the large-sample efficiency $E_y$ of the arithmetic mean $\bar{y}$ of
observed values of y as a statistic for determining m (Bartlett, 1936, 1947). If x is a random
variable having a distribution (absolutely continuous or discrete) such that the arithmetic
mean $\bar{x}$ is a sufficient statistic for determining a parameter $\theta$, it is easy to show, using the
form of the frequency function of x given by Fisher (1934), that the large-sample efficiency
$E_y$ of the average $\bar{y}$ of any function y of x, for determining $\theta$, is the square of the correlation
coefficient between x and y, i.e.

$$E_y = \frac{[\operatorname{cov}(x, y)]^2}{\operatorname{var}(x)\,\operatorname{var}(y)}. \tag{2.15}$$


In the present instance, 


e = [° ov (^ y )] 2 x 

v mvar(y) ~ 8 rri 64 m 2 


(2-16) 
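The efficiency defined as a squared correlation is easily evaluated exactly for the Poisson case. In the Python sketch below the function name and the values of m are assumptions; the comparison value $1 - 1/(8m)$ is the first-order result given by expanding the covariance and variance in powers of 1/m, and is quoted here only as a rough guide.

```python
import math


def efficiency(m, c):
    """Squared correlation between r and y = sqrt(r + c) for r ~ Poisson(m)."""
    r_max = int(m + 20.0 * math.sqrt(m) + 50.0)
    prob = math.exp(-m)
    e_y = prob * math.sqrt(c)   # running E(y)
    e_ry = 0.0                  # running E(r * y); the r = 0 term vanishes
    for r in range(1, r_max + 1):
        prob *= m / r
        y = math.sqrt(r + c)
        e_y += prob * y
        e_ry += prob * r * y
    cov = e_ry - m * e_y
    var_y = (m + c) - e_y * e_y
    return cov * cov / (m * var_y)   # var(r) = m


for m in (2.0, 5.0, 10.0, 50.0):
    print(m, efficiency(m, 0.375), 1.0 - 1.0 / (8.0 * m))
```

The loss of efficiency is small even for quite moderate means.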


3. Binomial distribution 

We suppose now that r is distributed in a binomial distribution with total number n and
mean m (0 < m < n), and we consider the transformation

$$y = \sqrt{(n + d_2)}\,\sin^{-1}\sqrt{\frac{r + c}{n + d_1}}, \tag{3.1}$$

where c, $d_1$, $d_2$ are constants to be determined. Setting

$$r - m = t, \quad m + c = m', \quad n + d_1 = n_1, \quad n + d_2 = n_2,$$

the transformation becomes

$$y = \sqrt{n_2}\,\sin^{-1}\sqrt{\frac{m' + t}{n_1}}. \tag{3.2}$$


We can expand y in a Taylor series in ascending powers of t, for $-m' \le t < n_1 - m'$, and show
that $R_s$, the remainder after s terms, is such that


t°yjn 2 


is bounded for all t in the range considered, with bounds depending on s and the ratio $m'/n_1$
only. Thus 


(3*3) 



e 249 

ihe moments of y, valid for large n and 


# - 3 — 8c 3 + 8c — SdA 
p~ + 8m 8(n — m) ]’ 

/2 , we have 



(3-4) 


(3-5) 


. 1 / = 2c, so that the transformation is symmetrical about r = in. 

effects the scale of y (for n fixed), and not the constancy of var (y) as 
jue shape of the distribution of y. 

The equations corresponding to (2.12), (2.13), (2.14) and (2.16) are (proceeding to one
term fewer in each case, for simplicity)


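$$m_y \sim m + \frac{2m - n}{4n}, \tag{3.6}$$

$$\gamma_1 \sim \frac{2m - n}{2\{nm(n - m)\}^{1/2}}, \tag{3.7}$$

$$\gamma_2 \sim \frac{n^2 - 2m(n - m)}{nm(n - m)}, \tag{3.8}$$

$$E_y \sim 1 - \frac{(2m - n)^2}{8nm(n - m)}. \tag{3.9}$$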


4. Negative binomial distribution 

We can deal similarly with a negative binomial variable r, with mean m and exponent k
(m, k > 0), such that the probability of observing a value r is

$$P_r = \frac{\Gamma(r + k)}{r!\,\Gamma(k)}\left(\frac{m}{m + k}\right)^r\left(1 + \frac{m}{k}\right)^{-k}. \tag{4.1}$$


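The asymptotic expansions will be valid for large m and constant ratio k/m. We find the
transformation

$$y = \sqrt{(k - \tfrac12)}\,\sinh^{-1}\sqrt{\frac{r + \tfrac38}{k - \tfrac34}}, \tag{4.2}$$

with

$$\operatorname{var}(y) = \frac14 + O\!\left(\frac{1}{m^2}\right). \tag{4.3}$$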


But it is of more interest to consider m large and k fixed. (The corresponding problem does 
not arise with the positive binomial, since m < n necessarily.) The preceding method of 
obtaining the expansions now breaks down, as it relied on the ratio of standard deviation 
to mean of r tending to zero as $m \to \infty$. Now we have the ratio tending to $k^{-1/2}$.

We consider two transformations,

$$y = 2\sinh^{-1}\sqrt{\frac{r + c}{k + d}} \tag{4.4}$$

and

$$y = \ln(r + A). \tag{4.5}$$

It is supposed that c, k + d, and A are positive and constant. Apart from an added constant,
(4.4) may be written

$$y = 2\ln\{\sqrt{(r + c)} + \sqrt{(r + c + k + d)}\}. \tag{4.6}$$
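That the two forms differ only by a constant follows from $\sinh^{-1}x = \ln\{x + \sqrt{(x^2 + 1)}\}$, the constant being $\ln(k + d)$. The Python sketch below verifies this numerically; the function names and the particular values of c, k and d are assumptions chosen only so that c and k + d are positive.

```python
import math


def form_44(r, c, k, d):
    """y = 2 * asinh(sqrt((r + c) / (k + d)))."""
    return 2.0 * math.asinh(math.sqrt((r + c) / (k + d)))


def form_46(r, c, k, d):
    """y = 2 * ln(sqrt(r + c) + sqrt(r + c + k + d))."""
    return 2.0 * math.log(math.sqrt(r + c) + math.sqrt(r + c + k + d))


c, k, d = 0.375, 2.0, -0.75  # hypothetical constants
offset = math.log(k + d)     # the added constant separating the two forms
for r in (0, 1, 5, 50):
    print(r, form_44(r, c, k, d), form_46(r, c, k, d) - offset)
```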



When r is large, we have (again ignoring 

t/~ln?°' 

where for (4-4) or (4*6) 

A = i(2c + k + d), B* = i{8c a n ? ' ; 
and for (4*5) B - A. 

We proceed to find an asymptotic expansion (as m -*■ ao,* - 


generating function of y, i.e. for 

M(t) = Ze vl P r - 

r-0 


We set 

m 

T = e > 

m + k 

(2-14) 

so that a->0 as m-*oo; and 

We first prove the following 

c W r ( r + *) e -r a _ M(a) 

C T(fc)r! e UrW ‘ 

(4*12) 

Lemma. As α → 0,

$$\sum_{r=0}^{\infty} u_r(\alpha) - \int_0^{\infty} u_r(\alpha)\,dr \tag{4.13}$$

tends to a finite limit (depending on k and t, and on which function y of r is chosen, namely (4.4)
or (4.5)).


Proof. By differentiating ln« r (a) with respect to r we see that as r->oo 


(4-14) 


where C depends on <+ k— 1 and p but not on r or a, and the expression 0(r <,,+1) ) is valid 
uniformly as ft- h^). Since for large r 


u r (ct) = 


yi+k ~\ £~tcl 


1 + 0 - 

r 


we see that 


m ( 

(^) %(«) = C'e- r «r < +* : - 1 -P^l + 0^ 


(4-15) 


(4-16) 


as r oo, uniformly as a -> 0, where C satisfies the same condition as before. Ifp>$ + &+ l, 


we may write 




(a) 


C 


(4-17) 


(r + 1 ) 2 ’ 

where C is independent of a, and this holds for all r ^ 0 because the left-hand side is bounded 
for r in any bounded interval. 

For all p, $(d/dr)^p u_r(\alpha) \to 0$ as r → ∞, while at r = 0 $(d/dr)^p u_r(\alpha)$ is finite and tends to a finite
limit as α → 0. We therefore have the Euler-Maclaurin expansion


where 


iu r (cc) = j\(a) dr + K(*) (|;)* \(d)+R t , 

*-Jo*>(s; u A a ) dr > 


(4-18) 

(4-19) 


and φ(r) is a bounded periodic function of r, depending on s only. k is given, and we may
restrict t to a neighbourhood of zero, say |t| < ε. We choose s to be an integer satisfying
2s > k + 1 + ε. Then from (4.17) we see that $R_s$ tends to a finite limit as α → 0, as do all the
other terms after the first on the right-hand side of (4.18). Hence the lemma is proved.





|.(a) dr + 0(a k ). (4-20) 

;J itactly. We may, however, expand u r (a) for 
»ng shown in (4-15) above. The error of the expansion 
next term (independent of a) for r > 1. Integrating term 
a oo, we obtain after a tedious reduction the following 

ditions of (4-1), (4-7), (4-10) and (4-11), M(t) can be expanded 
. in the form 

x- V(k) [ l + {A ~^ k)t T+T^i 

+ {(HA - **)»+£*)<» + akA - &*(* + 3)- mt} ^ - Z T ^+ 7 -2) + „•] + 0{ak) - 

(4-21) 

The series in square brackets is continued as far as the term in $\alpha^n$, where n is the greatest integer
less than k; t is supposed confined to a neighbourhood of zero.

The cumulant-generating function is now found by taking the logarithm of M(t), namely, 


Hence 


K{t) = - /In a -fin r(fc + /)-lnr(fc)-f (A — \k)t r — - — - -f .... 

rC "f t — i 

/ , i r / 1 \ ^ 2 A 

var (y) ~<]/(k) + 


(4-22) 

(4-23) 


if k > 1. We use the notation ψ(k), ψ′(k), etc., for the successive derivatives of ln Γ(k). If
A = ½k, we have


var (y) ~ rjr'(k) + 


k(k - 1 ) (* - 2) ~ ( 2k - 3 ) (5fc* - 3k - 1 2 B 2 ) 


(4-24) 


v v-/ ■ I2(k — l) 3 (k—2)* 

if k > 2. Considering y defined by (4.4), the condition A = ½k gives d = −2c, and the coefficient
of α² in var(y) vanishes if c takes a value dependent on k, which for large k is approximately

$$\frac{3}{8} + \frac{23}{192k}, \tag{4.25}$$



and which rises to a little above 0-4 as k decreases towards 2. It appears from the numerical 
work below that a value of c somewhat under f is optimum for practical purposes (if k > 2). 
For y defined by (4-5), we set A = and then 

vat (y) ~ r (*) - ? ^t-2). »'• < 4 ' 26 > 

if k > 2. If A is large and m > i, we have a ~ kjm and 


(4-26) 


var (y) 




(4-27) 


Thus the larger k is, the larger m must be for var(y) to approach its limiting value when
m → ∞. The transformation (4.5) is therefore not satisfactory if k is large.

For either form of transformation, we find the following limiting values as m → ∞ (α → 0):

$$m_y/m = \exp\{\psi(k) - \ln k\}, \tag{4.28}$$

$$\gamma_1 = \psi''(k)/[\psi'(k)]^{3/2}, \tag{4.29}$$

$$\gamma_2 = \psi'''(k)/[\psi'(k)]^{2}. \tag{4.30}$$
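The limiting values can be evaluated for any k with a few lines of code. The sketch below assumes the limits take the forms m_y/m = exp{ψ(k) − ln k}, limiting variance ψ′(k), γ₁ = ψ″(k)/[ψ′(k)]^{3/2} and γ₂ = ψ‴(k)/[ψ′(k)]², where ψ is the derivative of ln Γ. The polygamma routines are a simple recurrence-plus-asymptotic-tail approximation written for this illustration; they are not taken from the paper.

```python
import math

def digamma(x: float, shift: int = 50) -> float:
    """psi(x) from psi(x) = psi(x + N) - sum_{j<N} 1/(x + j), with an asymptotic psi(x + N)."""
    big = x + shift
    tail = math.log(big) - 1.0 / (2 * big) - 1.0 / (12 * big ** 2)
    return tail - sum(1.0 / (x + j) for j in range(shift))

def polygamma(n: int, x: float, shift: int = 50) -> float:
    """n-th derivative of psi (n >= 1): (-1)^(n+1) n! sum_{j>=0} (x + j)^-(n+1)."""
    s = n + 1
    big = x + shift
    head = sum((x + j) ** (-s) for j in range(shift))
    # Euler-Maclaurin estimate of the remaining terms of the series.
    tail = big ** (1 - s) / (s - 1) + 0.5 * big ** (-s) + s / 12.0 * big ** (-s - 1)
    return (-1) ** (n + 1) * math.factorial(n) * (head + tail)

def limiting_values(k: float) -> dict:
    """Limiting bias ratio, variance and shape constants as m tends to infinity."""
    psi1 = polygamma(1, k)
    return {
        "m_y_over_m": math.exp(digamma(k) - math.log(k)),
        "limiting_variance": psi1,
        "gamma_1": polygamma(2, k) / psi1 ** 1.5,
        "gamma_2": polygamma(3, k) / psi1 ** 2,
    }

if __name__ == "__main__":
    for k in (1.0, 2.0, 5.0):
        print(k, limiting_values(k))
```

Since ψ″(k) is negative for all k > 0, the limiting distribution of y is negatively skew under these assumed forms.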



252 Transformation of Poisson, q 

The effect of the transformation on the s 
logarithmic transformation of a variance 
Bartlett & Kendall ( 1 946). We find also, by 
of y in estimating m, namely, E y = 

For a > 0, these expressions hold with error 0\a m ) v 
To conclude this investigation of asymptotic propex 
(4-6) when k = 1. We note first that if k < 2, and t is suita L 
(4.13) is also the limit as N → ∞ of

$$\sum_{r=0}^{N} u_r(0) - \int_0^{N+1} u_r(0)\,dr.$$

When k = 1, $u_r(0) = (r+A)^t$,

and the limit is easily evaluated as

$$f(A,t) = \tfrac12 + \ln\{\sqrt{(2\pi)}\,A^{A}e^{-A}/\Gamma(A)\}\,t + O(t^2).$$

We also note that the derivative of (4.13) with respect to α is finite and tends to a finite limit
as α → 0 (proof as for the lemma itself), so that the error in replacing (4.13) by its limit as
α → 0 is O(α). We thus find

$$M(t) = f(A,t)\,\alpha + (1-e^{-\alpha})\int_0^{\infty}(r+A)^t e^{-r\alpha}\,dr + O(\alpha^2). \tag{4.35}$$

The integral is an incomplete Γ-function and is easily expanded in powers of α. Hence

$$\operatorname{var}(y) = \psi'(1) - (A-\tfrac12)\,\alpha(\ln\alpha)^2 + 2\{(A-\tfrac12)\psi(1) + A - A\ln A + \ln(\sqrt{(2\pi)}\,A^{A}e^{-A}/\Gamma(A))\}\,\alpha\ln\alpha + O(\alpha). \tag{4.36}$$

Setting A = ½ (in agreement with our previous rule A = ½k), and α ~ m⁻¹, we have

$$\operatorname{var}(y) = \psi'(1) - \ln 2\,\frac{\ln m}{m} + O\!\left(\frac{1}{m}\right). \tag{4.37}$$

5. Numerical investigation 

Table 1 refers to the transformations (1.3), (1.4) and (1.5), with c = ⅜ in each case. The first
section gives the error of the estimate $m_y$ of m got by applying the transformation in reverse
to E(y). The second section gives the ratio of var(y) to the 'limiting variance'. The limiting
variances are respectively ¼, ¼(n + ½)⁻¹ and ¼ψ′(k), the first and third of these being the actual
limits as m → ∞, and the second the value suggested in § 3 for n large. The remaining sections
of Table 1 give the shape constants γ₁ and γ₂ of the distribution of y and the efficiency $E_y$.
With the binomial distribution, no entries are given for m > ½n, as the functions are
obviously symmetric or skew-symmetric about m = ½n.

For the Poisson distribution, the variances may be compared with those given by Bartlett 
(1936) for c =* 0 and For the negative binomial, Table 2 gives var (y) when c = 0 and 
and k = 2; and also var(y) for the transformation (1.6), when k = 2 and 5, the limiting
variance being now ψ′(k).

Comparing (1.5) and (1.6), we have indicated above that (1.5) is more effective in stabilizing
the variance if k > 2, and the computations confirm this. If c = ⅜, (1.5) is defined for all
k > ¾, and it is quite possible that throughout this range it is superior to (1.6). As k → ¾ (or,
more generally, as k → 2c) the two transformations become equivalent, and (1.6) is defined
for all k > 0. Bearing in mind that (1.6) is more convenient to use than (1.5), it seems
reasonable to recommend that (1.6) should be used if k < 2, and also for larger k if m is large; and
otherwise (1.5).









Table 2. Other transformation 


m 

Transformation (1* ! 
ife = 2 



c=0 

c = i 

L: 


Variance as fraction of the limiou 


1 

1*106 

0-503 

0*456 


2 

1*272 

0*697 

0*648 


3 

1*255 

0*793 

0*749 


4 

1*229 

0*849 

0*811 


6 

1*174 

0*908 

0*879 

0-b. 

10 

1*106 

0*954 

0*936 

0-928 J4) 

20 

1*045 

0-984 

0-976 

0-977 

00 

1*000 

1*000 

1*000 

1-000 j 


The transformation (1.3), with c = ⅜, derived by the formal expansion given in § 2, was
communicated to me by Mr A. H. L. Johnson. He has kindly agreed to my publishing his
result. I am indebted to Mr L. K. Turner and Dr A. A. Rayner for their patience in carrying
out the laborious computations.


REFERENCES 

Bartlett, M. S. (1936). The square root transformation in the analysis of variance. J. R. Statist. Soc.
Suppl. 3, 68.

Bartlett, M. S. (1947). The use of transformations. Biometrics, 3, 39.

Bartlett, M. S. & Kendall, D. G. (1946). The statistical analysis of variance-heterogeneity and the
logarithmic transformation. J. R. Statist. Soc. Suppl. 8, 128.

Beall, G. (1942). The transformation of data from entomological field experiments so that the analysis
of variance becomes applicable. Biometrika, 32, 243.

Fisher, R. A. (1934). Two new properties of mathematical likelihood. Proc. Roy. Soc. A, 144, 286.

Fisher, R. A. & Yates, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research
(3rd ed. 1948). Edinburgh: Oliver and Boyd.








SOME RESULTS IN THE TESTING OF SERIAL 
CORRELATION COEFFICIENTS 

By M. H. QUENOUILLE 

Anderson (1942) and Koopmans (1942) have recently investigated the distribution of the
serial correlation coefficient defined by

$$r_1 = \frac{\sum_{i=1}^{n} x_i x_{i+1}}{\sum_{i=1}^{n} x_i^2},$$

where the $x_i$ are normally and independently distributed about zero, and $x_{n+1}$ is taken as $x_1$.
This definition approximates to the definition

$$r_1 = \frac{\sum_{i=1}^{n-1} x_i x_{i+1}}{\left(\sum_{i=1}^{n-1} x_i^2 \sum_{i=1}^{n-1} x_{i+1}^2\right)^{1/2}},$$

which is more generally used, and using this form, Anderson was able to obtain an exact
distribution. Anderson's distribution was, however, difficult to use in practice, and Koopmans
obtained an integral approximation, which was adequate for n > 10. Rubin (1945) has
solved this integral to give the distribution

$$f(r)\,dr = \frac{\Gamma(\tfrac12 n+1)}{\Gamma(\tfrac12 n+\tfrac12)\,\Gamma(\tfrac12)}\,(1-r^2)^{\frac12(n-1)}\,dr,$$

which is, in fact, the distribution of the ordinary correlation coefficient based on n + 3
observations. Madow (1945) has demonstrated how this distribution may be extended, and
if the $x_i$ are connected by a linear Markoff scheme $x_{i+1} = \rho x_i + \epsilon_{i+1}$, where the $\epsilon_i$ are normally
and independently distributed about zero, it is not difficult to show that a good approximation
to the distribution of the first serial correlation coefficient is given by

$$h(r)\,dr = \frac{\Gamma(\tfrac12 n+1)}{\Gamma(\tfrac12 n+\tfrac12)\,\Gamma(\tfrac12)}\,\frac{(1-r^2)^{\frac12(n-1)}}{(1-2\rho r+\rho^2)^{\frac12 n}}\,dr. \tag{2}$$

We can, in a manner analogous to the method used with the ordinary correlation coefficient,
make the transformation

r = tanh z, ρ = tanh ζ and x = z − ζ in the form (2).

Then

$$h(x)\,dx = \frac{\Gamma(\tfrac12 n+1)}{\Gamma(\tfrac12 n+\tfrac12)\,\Gamma(\tfrac12)}\,\frac{dx}{\cosh\zeta\,\cosh^{n+1}x\,(1+\tanh\zeta\tanh x)\,(1-\tanh^2\zeta\tanh^2 x)^{\frac12 n}}.$$

Now, for n large, x = 0(1 /Jn) and 

, M 


Then h(x) dx = 


mra+1) .. n 1 „/l\ 

r(*» + i)r(*)“* loge 27r + 4» + % 8 )’ 

. * a 2x* /1\ 

logecoshx = , 

loge (1 — p a tanh a x) = -p a , 


l 0ge (1 +ptanhx) «= , 







whence 


where 


2 cosh 2 g _ 1 

a n n(l— p 2 ) 1 

I" 1 x 2 ( 1 — p 2 ) nx*( 1 — p 2 ) ( 1 — 3p 2 ) p 2 i 

[in 2 + IsT + ~2 

1 5# 2 (1 — p 2 ) na^l — /> 2 )(1 — 3p 2 ) p 2 x 


f £} x ’ 


X — 1 -px + 






We can now obtain the moments of the distribution in the form 

.. e mwi „/i\ .. i 


^ ■i 5 (T 6 ~?. ! j , + 0 (i)’ 


°G)’ 


n(l—p 2 ) n 2 (l—p 2 ) 2 + \n 

, - 3 . 2 (! — 9 ^ a ) , pi 

' * ~ »»( 1 -p 2 ) 2 re 8 (l ~/9*) 8 \ 


and the coefficients of skewness and kurtosis are given by 

6p 8 ^/1\ 2(1 — 3p 2 ) ,.(l\ 

7l ~ M 1 - />*)]» + (re*/ ’ ~ re( 1 - p 2 ) + (re 8 ) ' 

Hence, under this transformation, z will be distributed approximately normally about the
mean

$$\zeta - \frac{\rho}{n(1-\rho^2)} + \frac{\rho(1+\rho^2)}{n^2(1-\rho^2)^2},$$

with variance

$$\frac{1}{n(1-\rho^2)} - \frac{2\rho^2}{n^2(1-\rho^2)^2}.$$

It should be noted that, compared with the transformation of the ordinary correlation
coefficient, this transformation has the disadvantage that the variance of z is, to the first
order, dependent upon the value of ρ. Furthermore, for |ρ| large, the mean value of z
deviates more widely from ζ.
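The approximate mean and variance of z are simple to evaluate. The following sketch assumes the forms ζ − ρ/{n(1−ρ²)} + ρ(1+ρ²)/{n²(1−ρ²)²} for the mean and 1/{n(1−ρ²)} − 2ρ²/{n²(1−ρ²)²} for the variance, with ζ = tanh⁻¹ρ; the function name is an illustrative choice.

```python
import math

def z_mean_and_variance(rho: float, n: int):
    """Approximate mean and variance of z = atanh(r) for serial correlation rho."""
    zeta = math.atanh(rho)
    denom = n * (1.0 - rho ** 2)
    mean = zeta - rho / denom + rho * (1.0 + rho ** 2) / denom ** 2
    variance = 1.0 / denom - 2.0 * rho ** 2 / denom ** 2
    return mean, variance

if __name__ == "__main__":
    # rho = 0, n = 20 gives zero mean and variance 1/20.
    print(z_mean_and_variance(0.0, 20))
    # rho = 0.5, n = 20 gives a variance close to 0.0644.
    print(z_mean_and_variance(0.5, 20))
    # rho = tanh(0.625), n = 22 gives a variance close to 0.063.
    print(z_mean_and_variance(math.tanh(0.625), 22))
```

With ρ = 0 the variance reduces to 1/n, so the dependence on ρ enters only when the series is serially correlated.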

The following examples are intended to investigate and demonstrate the testing of serial
correlation coefficients when the error distribution of the $\epsilon_i$ is not normal.

In these examples, 1, 4 and 5 test the approximation by sampling artificial series with
rectangular error distributions, 2 and 3 use an exceptional error distribution to show that the
approximation is valid for n as low as twenty, while examples 6 and 7 demonstrate that the
values of r obtained from certain observed series are consistent with homogeneity of the data.

Example 1. Serial correlation coefficients were calculated (according to the circular
definition) from twenty sets of twenty random numbers, rectangularly distributed from −49
to 49.

The values of r thus obtained were:

−0.348, −0.267, −0.211, −0.202, −0.200, −0.174, −0.137, −0.067, −0.068, −0.056,
−0.030, 0.011, 0.100, 0.112, 0.177, 0.234, 0.236, 0.348, 0.376, 0.446.

These values were transformed by the transformation r = tanh z so that z should be
distributed with zero mean, and variance 0.0500. The estimated mean and variance about
zero were found to be 0.017 and 0.0530 respectively. The normality was, in addition, tested
by the calculation of skewness and kurtosis coefficients

g₁ = 0.595 ± 0.512, g₂ = −0.751 ± 0.992.

For small samples such as this, it must be borne in mind that the values of g₁ and g₂ do not
demonstrate normality but only the absence of extreme non-normality.
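The calculation in Example 1 can be retraced as follows. This is a sketch only: the twenty values of r are copied from the list above, and the results should land close to the quoted 0.017 and 0.0530, small differences being expected from rounding in the listed values. The standard-error formulae for g₁ and g₂ are the usual small-sample expressions for k-statistics, which are assumed here because the paper does not state them.

```python
import math

R_VALUES = [-0.348, -0.267, -0.211, -0.202, -0.200, -0.174, -0.137, -0.067,
            -0.068, -0.056, -0.030, 0.011, 0.100, 0.112, 0.177, 0.234,
            0.236, 0.348, 0.376, 0.446]

def transform(r_values):
    """Apply z = atanh(r) to each serial correlation coefficient."""
    return [math.atanh(r) for r in r_values]

def mean_and_variance_about(values, centre=0.0):
    """Sample mean, and mean squared deviation about a fixed centre."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - centre) ** 2 for v in values) / n
    return mean, var

def se_g1(n: int) -> float:
    """Assumed standard error of g1 for a normal sample of size n."""
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def se_g2(n: int) -> float:
    """Assumed standard error of g2 for a normal sample of size n."""
    return math.sqrt(24.0 * n * (n - 1) ** 2 / ((n - 3) * (n - 2) * (n + 3) * (n + 5)))

if __name__ == "__main__":
    z = transform(R_VALUES)
    print(mean_and_variance_about(z))
    print(se_g1(20), se_g2(20))
```

The same two standard-error functions give values close to 0.427 and 0.833 for thirty observations, and 0.752 and 1.481 for eight.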

Example 2. Suppose we consider the distribution P($x_i$ = 1) = P($x_i$ = −1) = ½, so that
E($x_i x_{i+1}$) = 0. Then each product of successive observations will be distributed in the same
manner as $x_i$, and r, based on n such products, which are independent, is distributed
binomially (if a circular definition is employed, then the same distribution of r, omitting alternate
ordinates, is obtained). Thus we can compare the terms of a binomial of degree n with the
ordinates of the distribution of the ordinary correlation coefficient based on n + 3
observations. The values of these are given below (blank cells are illegible in the source):

n = 5

  r      Binomial   Correlation    Binomial
         terms      coefficient    terms ÷ 0.4
                    ordinates
  1.0               0.00
  0.6    0.156      0.38           0.39
  0.2    0.312      0.86           0.78

n = 10

  r      Binomial   Correlation    Binomial
         terms      coefficient    terms ÷ 0.2
                    ordinates
  1.0    0.001      0.00           0.00
  0.8    0.010      0.01           0.05
  0.6    0.044      0.17           0.22
  0.4    0.117      0.59           0.59
  0.2    0.205      1.08           1.03
  0.0    0.246      1.29           1.23

n = 20

  r      Binomial   Correlation    Binomial
         terms      coefficient    terms ÷ 0.1
                    ordinates
  0.8    0.000      0.00           0.00
  0.7    0.001                     0.01
  0.6                              0.05
  0.5               0.12           0.15
  0.4               0.35           0.37
  0.3               0.74           0.74
  0.2    0.120      1.23           1.20
  0.1    0.160      1.64           1.60
  0.0    0.176      1.81           1.76

The comparison of the discrete and continuous distributions is difficult, but it is clear that,
in view of the nature of the error distribution, the approximation is good for n as low as 20.
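The comparison in Example 2 can be regenerated numerically. The sketch below assumes that r = (2j − n)/n when j of the n products equal +1, that the binomial terms are divided by the spacing 2/n of the possible values of r, and that the continuous ordinate is that of the correlation coefficient based on n + 3 observations, proportional to (1 − r²)^{(n−1)/2}. Function names are illustrative.

```python
import math

def binomial_terms(n: int) -> dict:
    """Map r = (2j - n)/n to the symmetric binomial probability of j successes."""
    return {round((2 * j - n) / n, 10): math.comb(n, j) / 2 ** n for j in range(n + 1)}

def correlation_ordinate(r: float, n: int) -> float:
    """Density of the ordinary correlation coefficient based on n + 3 observations."""
    if abs(r) >= 1.0:
        return 0.0
    log_c = math.lgamma(0.5 * n + 1) - math.lgamma(0.5 * n + 0.5) - math.lgamma(0.5)
    return math.exp(log_c + 0.5 * (n - 1) * math.log(1.0 - r * r))

def compare(n: int):
    """Rows (r, binomial term, continuous ordinate, binomial term / spacing) for r >= 0."""
    step = 2.0 / n
    rows = []
    for r, term in sorted(binomial_terms(n).items(), reverse=True):
        if r < 0:
            continue
        rows.append((r, term, correlation_ordinate(r, n), term / step))
    return rows

if __name__ == "__main__":
    for n in (5, 10, 20):
        print("n =", n)
        for row in compare(n):
            print("  %.1f  %.3f  %.2f  %.2f" % row)
```

Dividing each binomial term by the spacing puts the discrete probabilities on the same scale as the continuous density, which is what makes the last two columns comparable.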














Example 3. The method of Example 2 can be used for the case when ρ ≠ 0. If we consider
the distribution

P($x_{i+1}$ = 1 | $x_i$ = 1) = P($x_{i+1}$ = −1 | $x_i$ = −1) = p,

P($x_{i+1}$ = 1 | $x_i$ = −1) = P($x_{i+1}$ = −1 | $x_i$ = 1) = q,

then E($x_i x_{i+1}$) = p − q, and this will behave similarly to a linear Markoff scheme with
ρ = p − q, except that only the values ±1 are admissible. In this case, r is distributed
binomially with parameter p, so that we can compare the discrete binomial with the
continuous theoretical distribution, as below. In this case, the fit is less satisfactory, since the
binomial distribution is less skew, but, considering the approximate nature of the
assumption, the fit is remarkable for n as low as 20.


n — 20 


r 

p = f 

10 (binomial 
terms) 

p=i 

Correlation 

coefficient 

ordinates 

p = 1 

10 (binomial 
terms) 

P = i 

Correlation 

coefficient 

ordinates 

1-0 

0-00 

0*00 

0*03 


0*9 

0-03 

0*00 

0*21 


0-8 

0-14 

0-03 

0*07 

0*30 

0-7 

0-42 

0*24 

1*34 

1*20 

0*0 

0-91 

0*79 

1*89 

1*93 

0-5 

1*46 

1*45 

203 

2*09 

0-4 

1*82 

1*87 

1*09 

1*75 

0-3 

1-82 

1*87 

M2 

1*23 

0*2 

1-48 

1*54 

0*61 

0*75 

0-1 

0-99 

1*06 

0*27 

0*40 

0*0 

0-54 

0*03 

0*10 


-01 

0*25 

0*32 

0*03 

■ . 

-0-2 

0*09 

0*14 

0*01 


— 0*3 

0-03 

0*05 

0*00 

0*01 

-0-4 

0-01 

0*01 

— 

— 

— 0*5 

0*00 

0*00 

— 

— 


Example 4. Serial correlation coefficients were calculated, using the ordinary definition,
from thirty sets of twenty-one serially correlated numbers. These numbers had been derived
using the scheme $x_{i+1} = 0.5x_i + \epsilon_{i+1}$, where the $\epsilon_i$ were rectangularly distributed from −49
to 49. The thirty values of r thus obtained were

−0.113, 0.276, 0.282, 0.327, 0.339, 0.352, 0.386, 0.408, 0.411, 0.424,
0.428, 0.435, 0.445, 0.446, 0.476, 0.518, 0.547, 0.550, 0.562, 0.577,
0.590, 0.613, 0.614, 0.630, 0.645, 0.649, 0.672, 0.677, 0.718, 0.744.
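A sampling experiment of the kind described in Example 4 can be sketched as below. Several details are assumptions made for illustration only: the errors are drawn as integers uniformly from −49 to 49, the series is started at zero and given a burn-in before the twenty-one terms are kept, and the 'ordinary definition' is taken to be the sum of lag-one products divided by the geometric mean of the two sums of squares. The random numbers will of course differ from those used in the paper.

```python
import math
import random

def serial_correlation(x):
    """First serial correlation, non-circular form, from a list of observations."""
    head, tail = x[:-1], x[1:]
    num = sum(a * b for a, b in zip(head, tail))
    den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return num / den

def markoff_series(length, rho, rng, burn_in=50):
    """x[i+1] = rho * x[i] + e[i+1], with e rectangular on the integers -49..49."""
    x = 0.0
    out = []
    for i in range(burn_in + length):
        x = rho * x + rng.randint(-49, 49)
        if i >= burn_in:
            out.append(x)
    return out

def simulate(sets=30, length=21, rho=0.5, seed=1):
    """Sorted serial correlations from repeated artificial series."""
    rng = random.Random(seed)
    return sorted(serial_correlation(markoff_series(length, rho, rng))
                  for _ in range(sets))

if __name__ == "__main__":
    values = simulate()
    z = [math.atanh(r) for r in values]
    print(sum(z) / len(z))
```

Repeating the run with different seeds gives a rough idea of how much the mean and variance of z vary between batches of thirty series.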


These values were then transformed, and, in this case, we expect z to be distributed about
mean 0-5 (0-5) (1-26) 


0-5494- 


with variance 


20(0-75) 400(0-76) 8 

1 0-5 


= 0-6209, 


- 0-0644. 


20(0-75) 400(0-75)* 

The estimated values of the mean and of the variance about the theoretical mean were 
0-557 and 0-0517, while g x = 0-024 ± 0-427 and g t — 0-785 ± 0-833 indicated no significant 
deviation from normality. 






Example 5. M. G. Kendall (1946) has calculated correlograms for eight sub-series of a
series of 480 terms of the autoregressive scheme $x_{i+2} = 1.1x_{i+1} - 0.5x_i + \epsilon_{i+2}$, where the $\epsilon_i$ are
rectangularly distributed from −49 to 49.

We might test his eight values of r₁ against the theoretical value ρ₁ = 0.733. In this case,
the $\epsilon_i$ in the scheme $x_{i+1} = \rho_1 x_i + \epsilon_{i+1}$ are correlated. The expected values of the mean and
variance of the z distribution are easily found to be 0.9099 and 0.0353, and these may be
compared with the mean, 1.0181, and the variance about the theoretical mean, 0.0454,
calculated from the transformed values, 0.7498, 0.9181, 0.9223, 0.9439, 0.9962, 1.0362,
1.1786, 1.4007. The normality can again be tested giving

g₁ = 0.667 ± 0.752, g₂ = 0.876 ± 1.481.
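The summary figures of this example follow directly from the eight transformed values. The sketch below takes the theoretical mean as 0.9099 and the theoretical variance as 0.0353 (the latter as given in the summary table), and forms the mean, the variance about the theoretical mean, and the ratio used later as a χ² with 8 degrees of freedom. Variable names are illustrative.

```python
Z_VALUES = [0.7498, 0.9181, 0.9223, 0.9439, 0.9962, 1.0362, 1.1786, 1.4007]
THEORETICAL_MEAN = 0.9099
THEORETICAL_VARIANCE = 0.0353

def summarize(z_values, centre, sigma2):
    """Mean, variance about the given centre, and chi-square = n * s2 / sigma2."""
    n = len(z_values)
    mean = sum(z_values) / n
    s2 = sum((z - centre) ** 2 for z in z_values) / n
    return mean, s2, n * s2 / sigma2

if __name__ == "__main__":
    print(summarize(Z_VALUES, THEORETICAL_MEAN, THEORETICAL_VARIANCE))
```

The variance is taken about the theoretical mean rather than the sample mean, so all eight values contribute a degree of freedom.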

Example 6. Sir Gilbert Walker (1946) has calculated serial correlation coefficients for
pressure data observed at daily intervals. His values for the period October to March in
the years 1930-6 were 0.76, 0.88, 0.80, 0.88, 0.79 and 0.82. These can be transformed and
tested giving s² = 0.0215 compared with the theoretical σ² = 0.0174 approximately.

A similar set of unpublished data calculated for six successive months gave values of r₁
equal to 0.62, 0.76, 0.86, 0.78, 0.60, 0.86 with s² = 0.0688 and σ² = 0.0803 approximately.

Example 7. The Beveridge series of trend-free wheat-price index numbers was split into
sixteen subseries of twenty-three terms each, and serial correlation coefficients were
calculated for these subseries. The values obtained were 0.379, 0.460, 0.329, 0.421, 0.772, 0.464,
0.678, 0.767, 0.728, 0.311, 0.497, 0.666, 0.270, 0.508, 0.691 and 0.689 respectively. These
were transformed, and the mean and variance of the transformed values were found to be
0.625 and 0.0625. If we assume ρ = tanh 0.625 = 0.5546, then

$$\sigma^2 = \frac{1}{22(0.6925)} - \frac{2(0.3075)}{484(0.6925)^2} = 0.0631,$$

which agrees remarkably well with the observed variance.


Summary 

We may summarize the results of these examples in tabular form, as below:

  Example    ρ        σ²        s²        Degrees of    χ²       P
                                          freedom
  1          0.0      0.0500    0.0530    20            21.20    0.39
  4          0.5      0.0644    0.0517    30            24.08    0.77
  5          0.733    0.0353    0.0454     8            10.29    0.25
  6(a)       0.827    0.0174    0.0215     5             6.18    0.29
   (b)       0.765    0.0803    0.0688     5             4.28    0.51
  7          0.554    0.0631    0.0625    15            14.86    0.46
  Total                                   83            80.88    0.55
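The χ² column of the summary can be recomputed from the other columns, on the assumption that each entry is (degrees of freedom) × s²/σ². The sketch below uses the tabulated values; the probabilities P are not recomputed, since that would need a χ² distribution function not assumed here.

```python
# (example, theoretical variance, observed variance, degrees of freedom)
ROWS = [
    ("1", 0.0500, 0.0530, 20),
    ("4", 0.0644, 0.0517, 30),
    ("5", 0.0353, 0.0454, 8),
    ("6(a)", 0.0174, 0.0215, 5),
    ("6(b)", 0.0803, 0.0688, 5),
    ("7", 0.0631, 0.0625, 15),
]

def chi_square(sigma2: float, s2: float, df: int) -> float:
    """Ratio of observed to theoretical variance, scaled by the degrees of freedom."""
    return df * s2 / sigma2

def summary(rows):
    """Per-example chi-square values plus the pooled degrees of freedom and total."""
    values = {name: chi_square(sigma2, s2, df) for name, sigma2, s2, df in rows}
    total_df = sum(df for _, _, _, df in rows)
    return values, total_df, sum(values.values())

if __name__ == "__main__":
    values, total_df, total = summary(ROWS)
    for name, value in values.items():
        print(name, round(value, 2))
    print("Total", total_df, round(total, 2))
```

Pooling is legitimate here because independent χ² values add, along with their degrees of freedom.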


Thus, while the results given here do not constitute definite proof, there is every indication
that the approximate normal theory provides a satisfactory test for serial correlation
coefficients, if the number of observations is sufficiently high, and such a test is undoubtedly
useful in numerous ways, varying from the determination of density of observations in
sampling enquiries to the comparison of 'after-effects' in biological or economic research.

It is worth noting that the method of Examples 2 and 3 can be used to investigate the
approximate test commonly used to test the correlation between two serially correlated
series. If the correlations between successive terms of the two series are ρ₁ and ρ₂, then the
correlation between corresponding terms of the series is usually tested with

$$n' = n(1-\rho_1\rho_2)/(1+\rho_1\rho_2)$$

degrees of freedom, i.e. as if the correlation were based on n′ pairs of observations. This
result was first proved by Bartlett (1935).
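The effective number of pairs is easily tabulated. The sketch below assumes the formula n′ = n(1 − ρ₁ρ₂)/(1 + ρ₁ρ₂) and treats the limiting case ρ₁ρ₂ = −1 as infinite; the function name is illustrative.

```python
import math

def effective_pairs(n: int, rho1: float, rho2: float) -> float:
    """Effective number of independent pairs for two serially correlated series."""
    product = rho1 * rho2
    if product <= -1.0:
        return math.inf
    return n * (1.0 - product) / (1.0 + product)

if __name__ == "__main__":
    for product in (1.0, 0.8, 0.5, 0.0, -0.5, -0.8, -1.0):
        print(product, round(effective_pairs(20, product, 1.0), 1))
```

Positive serial correlation in both series reduces the effective number of pairs below n, while serial correlations of opposite sign increase it.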

Suppose we consider two schemes of the kind described in Example 3, with parameters
ρ₁ = p₁ − q₁ and ρ₂ = p₂ − q₂. Then if

p = p₁p₂ + q₁q₂ = ½(1 + ρ₁ρ₂), q = p₁q₂ + p₂q₁ = ½(1 − ρ₁ρ₂) and ρ = p − q = ρ₁ρ₂,

it is not difficult to show that the distribution of the correlation r between the two schemes
is given by

$$P(r = 1) = P(r = -1) = \tfrac12 p^{n-1},$$

p|r= 1_ n) = P n ~\ + (\n- + l) 2 -r 2 (|n) 2 ]^ n_ V 

+ ~ 2 ) - 1 )* • - r 8 ( 

+ (^)i ~ 2 ) 2 “ r*(i») 2 ] [(i w ~ 1 ) 2 - r*( \nf]p n ~*q 6 + . . . . 

The ordinates of this distribution are given below, for n = 20 and

  ρ  = 1.0,  0.8,  0.5,   0.0,   −0.5,  −0.8,   −1.0,
  p  = 1.0,  0.9,  0.75,  0.5,   0.25,  0.1,    0.0,
  n′ = 0,    2.2,  6.7,   20.0,  60.0,  180.0,  ∞.


\ P 

r \ 

1-0 

0*8 

0-5 

0 

-0-5 

-0-8 

-1-0 


0-5 

0-0676 

0-0021 





0*9 

0-0 

0-0300 

0-0050 

— 

— 

— 

— 

0*8 

0-0 

0-0340 

0-0118 

■ ■ 

— 1 

' — 

— 

0-7 

0-0 

0-0389 

0-0210 


— 

— 

— 

0-0 

0-0 

0-0430 

0-0332 


— 

— i 

— 

0*5 

0-0 

0-0466 

0-0475 



— 

— 

0-4 

0*0 

0-0498 

0*0029 

0-0370 


— 

— 

0-3 

0-0 

0-0623 

0-0770 



0-0013 

— 

0-2 

0-0 

0-0642 

0-0898 

0-1201 


0-0203 

— 

0-1 

0*0 

0-0663 

0-0980 



0-2219 

— 

0-0 

0-0 

0*0667 

0-1008 



0-5010 

1-0000 


A comparison of these values with tables of the distribution of the correlation coefficient
constructed by David (1938) shows that they are distributed to a high degree of approximation
with n′ + 2 degrees of freedom. The extra two degrees of freedom are more likely due to
the nature of the error distribution than to the approximate nature of n′, since they are
independent of ρ. In any case, the conclusion reached is that the approximation is valid
provided that the effective number of degrees of freedom is large.


REFERENCES 

Anderson, R. L. (1942). Ann. Math. Statist. 13, 1.

Bartlett, M. S. (1935). J. R. Statist. Soc. 98, 536.

David, F. N. (1938). Tables of the Ordinates and Probability Integral of the Distribution of the Correlation
Coefficient in Small Samples. Cambridge University Press.

Kendall, M. G. (1946). Contributions to the Study of Oscillatory Time Series. National Institute of
Economic and Social Research.

Koopmans, T. (1942). Ann. Math. Statist. 13, 14.

Madow, W. G. (1945). Ann. Math. Statist. 16, 308.

Rubin, H. (1945). Ann. Math. Statist. 16, 211.

Walker, Sir G. (1946). Quart. J.R. Met. Soc. 72, 265.





FRACTIONAL REPLICATION ARRANGEMENTS FOR FACTORIAL 
EXPERIMENTS WITH FACTORS AT TWO LEVELS 

By K. A. BROWNLEE, B. K. KELLY and P. K. LORAINE 
The Research Department of the Distillers Company Ltd., Great Burgh, Epsom, Surrey 

The theory of fractional replication of factorial experiments has been given recently (Finney,
1945; Finney, 1946; Kempthorne, 1947). The present paper is a description of the solutions
of practical value in the case of the $2^{n-m}$ experiment (that is, the $1/2^{n-m}$ replicate of n
factors all at two levels, involving the use of $2^m$ plots). Such confounding and double
confounding arrangements as have been found will be given.

Two useful solutions are also included where one and two factors respectively are at four
levels, these being derived from solutions with all factors at two levels.

It will be assumed that the reader is familiar with the basic concepts of fractional
replication as set out by Finney (1945). These briefly are the group properties of treatment symbols
and effect symbols, the use of generators to define concisely groups and subgroups, the
concept of orthogonality between two subgroups, and the use of high-order interactions as
aliases of main effects and first-order interactions.

In confounding with fractional replication the confounding subgroup with its aliases
cannot contain any element in common with the alias subgroup except the identity and for
practical purposes should not contain main effects nor, as far as possible, first-order
interactions.

Fisher (1942a) introduced double confounding in which the two confounding subgroups
must not contain any element other than the identity in common. In fractional replication
there is the additional restriction that there should be no overlapping of the aliases.

To obtain the actual arrangement of plots in any particular simply confounded
experiment, having selected the alias subgroup and the confounding subgroup, we form the so-called
'principal block' from those treatments whose symbols are orthogonal to both these
subgroups. Subsequent blocks can be obtained by multiplying the principal block by symbols
orthogonal to the alias subgroup which have not previously occurred. With double
confounding these multipliers are of course the treatments of the principal block corresponding
to the second confounding subgroup.

Enumeration of subgroups

Kempthorne (1947) reports that no simple method of enumerating subgroups suitable for
high-order fractional replication has been found. It seems worth while, therefore, to record
a method which has been found satisfactory for all designs of practical interest. It inevitably
becomes more laborious for higher order subgroups, but such an enumeration is likely to be
of sufficient use to justify the time spent in arriving at it.

Subgroups of order $2^m$ of the effects group involving n factors can be subdivided into
types, two subgroups being considered of the same type when a permutation of the n letters
representing the factors converts one into the other. Each type may be represented by
a symbol

(n₁, n₂, n₃, ...),

where the ($2^m$ − 1) numbers n₁, n₂, etc., are the numbers of letters appearing in the separate
elements of subgroups of that type. The symbolism may be further condensed by attaching
to each number a suffix indicating how many times it is repeated. Thus the alias subgroup
given below for the 1/16 replicate of 8 factors would be represented by the symbol

(4₁₄, 8).

This symbolism is not perfect, as examples do occur where the same symbol represents
more than one type. In such cases, the separate types will be distinguished by suffixes
outside the brackets.

The following conditions are necessary for a symbol to represent a type of subgroup:

(1) The sum of all the numbers in the brackets must equal $2^{m-1}$ times the number of
letters actually used in the type.

(2) The numbers in the brackets must be all even or $2^{m-1}$ of them must be odd.

(3) When $2^{m-1}$ of the numbers are odd, the even numbers must, with the identity, form
a subgroup of order $2^{m-1}$.

(4) If the number n appears, the remaining numbers must be divisible into pairs such that
the total of each pair is n.

(5) If the number 1 appears, the remaining numbers must be divisible into pairs such
that the numbers in each pair differ by 1.

No other necessary conditions appear susceptible to expression in simple general terms.
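Conditions (1), (2), (4) and (5) are purely arithmetical and can be checked mechanically; condition (3) concerns group structure and is left out of this sketch. The representation of a symbol as a plain list of letter counts, and the function names, are assumptions made for illustration.

```python
from collections import Counter

def pairs_up(values, partner) -> bool:
    """True if the values split into pairs (v, partner(v)), matching the smallest first."""
    pool = Counter(values)
    for v in sorted(pool):
        while pool[v] > 0:
            pool[v] -= 1
            p = partner(v)
            if pool[p] <= 0:
                return False
            pool[p] -= 1
    return True

def check_symbol(numbers, m: int, letters_used: int, n: int) -> dict:
    """Check the arithmetical necessary conditions for a symbol of a subgroup of order 2^m."""
    odd = sum(1 for x in numbers if x % 2)
    results = {
        "count": len(numbers) == 2 ** m - 1,
        "(1)": sum(numbers) == 2 ** (m - 1) * letters_used,
        "(2)": odd == 0 or odd == 2 ** (m - 1),
    }
    if n in numbers:
        rest = list(numbers)
        rest.remove(n)
        results["(4)"] = pairs_up(rest, lambda v: n - v)
    if 1 in numbers:
        rest = list(numbers)
        rest.remove(1)
        results["(5)"] = pairs_up(rest, lambda v: v + 1)
    return results

if __name__ == "__main__":
    # The symbol with fourteen 4's and one 8, for 8 factors and a subgroup of order 16.
    print(check_symbol([4] * 14 + [8], m=4, letters_used=8, n=8))
```

A symbol that passes these checks is only a candidate: the existence of a corresponding subgroup still has to be examined separately.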

The actual process of enumeration is best demonstrated by an example. Suppose it is
required to find all the subgroups of order 8 involving 7 letters. The first step is to find all
the combinations of 7, i.e. $2^m - 1$, even numbers which total to 28, i.e. $n \times 2^{m-1}$. Each of
these is taken in turn and 2, i.e. $2^{m-2}$, of the numbers are increased by 1 and the same number
reduced by 1 in every possible way, bearing in mind the total number of elements containing
a particular number of letters in the complete group. Next any duplication introduced by
this process (some symbols will be derived from more than one combination of even numbers)
is eliminated. The result will be all the symbols satisfying conditions (1) and (2) above.
Further symbols are then rejected through failure to satisfy conditions (3), (4) and (5)
above.
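The first two steps of the enumeration can be sketched as follows. The code assumes that every element of the subgroup contains between 1 and n letters, and it does not apply the further restriction on how many elements of each length exist in the complete group, nor conditions (3) to (5); it only lists candidate symbols. Function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement

def even_combinations(n: int, m: int):
    """Multisets of 2^m - 1 even numbers (each from 2 to n) totalling n * 2^(m-1)."""
    count = 2 ** m - 1
    total = n * 2 ** (m - 1)
    evens = range(2, n + 1, 2)
    return [c for c in combinations_with_replacement(evens, count) if sum(c) == total]

def odd_symbols(even_combo, m: int, n: int):
    """Raise 2^(m-2) of the numbers by 1 and lower the same number by 1, in every way."""
    k = 2 ** (m - 2)
    indices = range(len(even_combo))
    found = set()
    for up in combinations(indices, k):
        rest = [i for i in indices if i not in up]
        for down in combinations(rest, k):
            symbol = list(even_combo)
            for i in up:
                symbol[i] += 1
            for i in down:
                symbol[i] -= 1
            if all(1 <= v <= n for v in symbol):
                found.add(tuple(sorted(symbol)))  # sorting removes duplicate symbols
    return found

if __name__ == "__main__":
    for combo in even_combinations(7, 3):
        print(combo, sorted(odd_symbols(combo, 3, 7)))
```

For subgroups of order 8 on 7 letters this produces four all-even starting combinations, each of which is then expanded into its candidate symbols with four odd numbers.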

The final step consists in the examination of the remaining symbols for the existence of
corresponding subgroups. This is best carried out by working from the even subgroup of
order $2^{m-1}$ contained within it. Any one of the odd numbers may be taken for the last
generator, and the number of factors this element must have in common with each of the
even elements can be determined by the fact that the products must give the other odd
elements. It can then be readily determined whether such a generator exists. The all-even
symbols may be examined in a similar way except that there is a wider choice for the
subgroup of order $2^{m-1}$. At this stage, symbols representing more than one type of subgroup
must be watched for. Considerable assistance can be obtained here by carrying out a check
of the total number of subgroups.

The total number of subgroups of order $2^m$ in the group of order $2^n$ is given by (Carmichael,
1937)

$$\frac{(2^n-1)(2^n-2)(2^n-2^2)\cdots(2^n-2^{m-1})}{(2^m-1)(2^m-2)(2^m-2^2)\cdots(2^m-2^{m-1})}.$$

Some of these subgroups, however, do not use all the n factors. The number using all factors
can be determined by a calculation exemplified in Table 1. The second column is derived by
applying the formula given by Carmichael. The third column is derived from the second
column by subtracting $\binom{n}{3}$ from each entry. The fourth column is derived from the third
by subtracting 11 times $\binom{n}{4}$ from each item and so on until the single entry in the sixth
column is obtained. This gives the number of subgroups of order 8 which employ all the
7 factors, 7000 such subgroups employing 6 factors or less.


Table 1

  (1)          (2)                (3)       (4)       (5)      (6)
  Factors n    Total subgroups
               of order 8
  3                 1
  4                15                11       -         -        -
  5               155               145       90        -        -
  6             1,395             1,375    1,210      670        -
  7            11,811            11,776   11,391    9,501    4,811
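The entries of Table 1 can be reproduced with a short calculation. The sketch below uses the formula for the total number of subgroups of order 2^m, and then removes, for each smaller number of factors, the subgroups that use exactly that many letters, counted once for every way of choosing those letters. This successive subtraction is one reading of the procedure described for the table; the function names are illustrative.

```python
from math import comb

def total_subgroups(n: int, m: int) -> int:
    """Number of subgroups of order 2^m in the elementary group of order 2^n."""
    num = den = 1
    for i in range(m):
        num *= 2 ** n - 2 ** i
        den *= 2 ** m - 2 ** i
    return num // den

def subgroups_using_all_factors(n: int, m: int) -> int:
    """Number of subgroups of order 2^m whose elements between them involve all n letters."""
    exact = {}
    for f in range(m, n + 1):
        smaller = sum(comb(f, g) * exact[g] for g in range(m, f))
        exact[f] = total_subgroups(f, m) - smaller
    return exact[n]

if __name__ == "__main__":
    for n in range(3, 8):
        print(n, total_subgroups(n, 3), subgroups_using_all_factors(n, 3))
```

For n = 7 and m = 3 the difference between the two counts is 7000, the number of subgroups employing 6 factors or less.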


When the existence of a subgroup corresponding to any symbol has been established, the 
number of subgroups of this type can be found by solving the appropriate combinatorial 
problem after one member of the type, or merely a set of generators, has been written out in 
full. Failure of the total to agree with that found as above after all symbols have been dealt 
with indicates, in the absence of errors, that at least one symbol represents more than one 
type of subgroup. In searching for this the all-even symbols should be examined first as 
these are usually, if not always, the source of the trouble. 

If only subgroups containing no elements with less than five letters are considered of 
interest, the work can be considerably lightened by carrying out the above process but only 
starting from the all-even combinations which contain no numbers less than 4 and, in 
enumerating from these, only considering symbols in which all the 4’s are raised to 5. 

In this case, however, there does not seem to be any numerical check, so that there is 
some risk of missing one or more types through their being represented by the same symbols 
as other types. The risk is not very serious, however, since types represented by the same 
symbol have most properties which are important from the point of view of fractional 
replication in common. 


Designs based on sixteen plots . 

There are designs based on 16 plots for 6, 7, or 8 factors (J, and ^ replicates respectively), 
which give main effects with second-order interactions as aliases. First-order interactions 
must be used as error, so if these exist and are large the residual will be inflated accordingly. 

The most general design is that for 8 factors. It is based on the alias subgroup of the type
$(4_{14}, 8)$ generated by ABCD, CDEF, ACEG, EFGH.

This can be confounded in 4 blocks of 4, confounding AB, BC and AC, with

(1), abcd, efgh, abcdefgh,

as the first block and subsequent blocks being obtained by multiplying this block
successively by abef, aceg, adeh.
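The structure of this alias subgroup and the four blocks can be checked mechanically. The following is a minimal sketch, assuming the generators read ABCD, CDEF, ACEG, EFGH and the multipliers read abef, aceg, adeh; the helper names `multiply` and `span` are hypothetical and are not taken from the paper.

```python
def multiply(u, v):
    """Product of two symbols; squared letters cancel (exponents mod 2)."""
    return "".join(sorted(set(u) ^ set(v)))

def span(generators):
    """Subgroup generated by the given symbols; '' stands for the identity."""
    group = {""}
    for g in generators:
        group |= {multiply(g, h) for h in group}
    return group

# Alias subgroup of the 8-factor design in 16 plots (assumed generators).
alias = span(["ABCD", "CDEF", "ACEG", "EFGH"])

# Count the elements by number of letters, ignoring the identity.
pattern = {}
for word in alias:
    if word:
        pattern[len(word)] = pattern.get(len(word), 0) + 1

# First block and the three further blocks obtained from the multipliers.
first_block = span(["abcd", "efgh"])
blocks = [sorted(first_block)]
for m in ["abef", "aceg", "adeh"]:
    blocks.append(sorted(multiply(m, t) for t in first_block))

for number, block in enumerate(blocks, 1):
    print(number, [t or "(1)" for t in block])
print(pattern)
```

Every treatment produced this way has an even number of letters in common with every element of the alias subgroup, which is the condition for it to belong to the fraction.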





K. A. Brownlee, B. K. Kelly and P. K. Loraine 

The designs for 6 and 7 factors can be obtained from the above. The alias subgroups for
these two cases are obtained by deleting those generators containing letters beyond the
sixth and seventh respectively, and the plot treatments are obtained by deleting the letters
beyond the sixth and seventh respectively where they occur.

Designs based on thirty-two plots

The two extreme designs based on 32 plots are the half-replicate of 6 factors, which gives its
main effects as aliases of fourth-order interactions and its first-order interactions as aliases
of third-order interactions, and the 1/32 replicate of 10 factors, which gives its main effects
as aliases of second-order interactions and no first-order interactions. Between these two
extremes come designs in which the order of aliases falls as the degree of completeness of
replication falls.

The possibilities are set out in Table 2. The confounding arrangements for 6 and 7 factors
can be used for double confounding, since in each case the two confounding subgroups and
their aliases contain no element in common other than the identity. However, in the case
of 7 factors, where it is desired merely to confound simply in 4 blocks of 8, it is better to use
the confounding subgroup I, ACF, ACG, FG as FG is already lost, so nothing further is
lost by this restriction.

It will be noted that the alias subgroup for the 1/16 replicate has the structure $(4_7, 5_7, 9)$.
At first sight one would have expected the subgroup generated by ABCD, ABEF, ABGH,
ACEGJ with the structure $(4_6, 5_8, 8)$ to be more satisfactory, but in point of fact, with this
alias subgroup, only one factor retains its first-order interactions.
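The remark about the alternative subgroup can be illustrated by enumeration. This is a sketch only, assuming the generators read ABCD, ABEF, ABGH, ACEGJ, and assuming that a factor "retains" its first-order interactions when it occurs in no four-letter element of the alias subgroup (so that none of its interactions has another first-order interaction as alias). The helper names are hypothetical.

```python
def multiply(u, v):
    """Product of two symbols with exponents reduced mod 2."""
    return "".join(sorted(set(u) ^ set(v)))

def span(generators):
    """Subgroup generated by the given symbols; '' is the identity."""
    group = {""}
    for g in generators:
        group |= {multiply(g, h) for h in group}
    return group

factors = "ABCDEFGHJ"
alias = span(["ABCD", "ABEF", "ABGH", "ACEGJ"])

# Structure of the subgroup: number of elements of each length.
structure = {}
for word in alias:
    if word:
        structure[len(word)] = structure.get(len(word), 0) + 1

# A four-letter element containing X and Y makes XY an alias of another
# first-order interaction, so look for factors outside all such elements.
four_letter = [w for w in alias if len(w) == 4]
retained = [f for f in factors if not any(f in w for w in four_letter)]

print(structure)
print(retained)
```

Under these assumptions the only factor returned is J, which agrees with the statement that a single factor retains its first-order interactions.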

Designs based on sixty-four plots

The designs for 64 plots proceed from the half-replicate of 7 factors with a high degree of
security in its aliases, first-order interactions having fourth-order interactions as aliases,
to the 1/32 replicate of 11 factors with all but three main effects having only second-order
interactions as aliases.

The possibilities are shown in Table 3. With 8 blocks of 8, in all cases double confounding 
is possible with zero or slight loss of first-order interactions. With 16 blocks of 4 and 4 blocks 
of 16, satisfactory doubly confounded arrangements have been found only for 7, 8 and 11 
factors, but simply confounded arrangements are given for 9 and 10 factors. 

Designs with more than sixty-four plots 

For experiments with more than 64 plots the investment in most fields of experimentation
is so great that it becomes essential to have a higher degree of certainty in the results, and 
hence the degree of ambiguity which can be tolerated with respect to the aliases is lower, 
i.e. the order of the interactions forming the aliases must be relatively higher. 

The possibilities are set out in Table 4. The alias subgroup for 12 factors is due to Fisher
(1942b). In all these designs the aliases of main effects are fourth-order interactions or
better, and the aliases of first-order interactions are third-order interactions or better. The 
confounding arrangements given do not lose any first-order interactions. 

There are certain less secure designs which may prove useful in certain circumstances and 
these are briefly summarized in Table 5. 



Table 2. Designs based on 32 plots

[Table body not recoverable from the scanned source.]


Table 3. Designs based on 64 plots

[Table body not recoverable from the scanned source.]


274 Fractional replication arrangements for factorial experiments 


Table 4. Designs with more than 64 plots

No. of factors                          9         12        16
Order of replication                    1/4       1/16      1/128
No. of plots                            128       256       512

Generators of alias subgroup            ABCDEF    ABEFLM    ABCDEFGH
                                        DEFGHJ    ABGHJK    EFGHJKLM
                                        —         ACEGKM    ABEFJKNO
                                        —         ADFGJM    ACEGJLPQ
                                        —         —         ADEHJMNP
                                        —         —         ADEGNQ
                                        —         —         ACJMNO

Confounding in 8 blocks:

Generators of confounding subgroup      ADG       ABE       ACE
                                        BEH       CDE       BDE
                                        CFJ       ACG       ADF

Generators of principal block           acdf      abcd      abcd
                                        abde      bcegjk    jklm
                                        degh      acehlm    nopq
                                        efhj      jklm      adegjk
                                        —         fhjl      bdfhkmn
                                        —         —         ghjloq

Multipliers by which subsequent         ab        bceh      acehp
blocks are obtained                     bc        afk       bcefn
                                        ac        bfj       abfhnp
                                        cfj       bek       cgmpq
                                        cfg       aej       aeghrnq
                                        cfh       adjm      befgmo
                                        cdh       efjk      dfghmnq


Table 5

No. of factors                          10        11        13
Order of replication                    1/8       1/16      1/32
No. of plots                            128       128       256

Generators of alias subgroup            ABGDQ     ADEKL     ABCDEF
                                        ABEFH     AFGHJ     DEFGHJ
                                        AOUJK     BDFJL     GHJKLM
                                        —         GDGJK     ADGKN
                                        —         —         BEHKM

Order of alias of main effects          Third     Third     Third

No. of first-order interactions:
  With third orders as aliases          17        7         34
  With second orders as aliases         28        48        44


In the case of the 1/8 replicate of 10 factors, for confounding in 4 blocks of 32, the confounding
subgroup can be
I = CEJK = ABGH = ABCEGHJK,

and for 8 blocks of 16 the further generator DFGH can be added, when the two first-order
interactions AG and CE are lost.





Experiments with some factors at four levels 
A factor at four levels can be regarded as two factors at two levels, the third degree of freedom 
coming from their interaction. In adopting solutions obtained for factors all at two levels, 
this interaction must be regarded as a main effect from the point of view of its occurrence 
in the sets of aliases. This makes the problem of obtaining satisfactory arrangements more 
difficult, and here only two solutions will be given. 


(a) The half-replicate of one factor at four levels and four factors at two levels
We can use I = ABCDEF as the alias subgroup, I = ABC = ADE = BCDE as the
confounding subgroup for 4 blocks of 8, and allocate B and D to represent the four-level
factor. This loses AF, and leads to the arrangement in Table 6, where the first number in
each treatment symbol represents the four-level factor and the remainder the factors
A, C, E and F.

Table 6 


10000 

21011 

00111 

10011 

21000 

00100 

20111

11100 

30000 

01101 

30110 

11010 

01110

30101 

11001 

20100 

11111

30011 

31010 

00001 

21101 

31001 

00010 

21110 


31100 

31111 

01011 

20001 

20010 

01000 

10110 

10101 


(b) The half-replicate of two factors at four levels and three factors at two levels
We use the alias subgroup I = ABCDEFG, the confounding subgroup

I = AEG = CFG = ACEF,

and allocate A and B to the first four-level factor and C and D to the second. All the degrees
of freedom for the main effects have third-order interactions as aliases, and all the degrees
of freedom for the first-order interactions have second-order interactions as aliases. The
resulting arrangement is in Table 7.

Table 7 


11000 

31000 

13000 

22000 

01111 

21111 

03111 

32111 

32001 

12001 

30001 

01001 

22110 

02110 

20110 

11110 

00000 

20000 

02000 

33000 

10111 

30111 

12111 

23111 

23001 

03001 

21001 

10001 

33110 

13110 

31110 

00110 

21100 

01100 

23100 

12100 

30100 

10100 

32100 

03100 

20011 

00011 

22011 

13011 

31011 

11011 

33011 

02011 

12010 

32010 

10010 

21010 

03010 

23010 

01010 

30010 

13101 

33101 

11101 

20101 

02101 

22101 

00101 

31101 


Acknowledgements are due to the Directors of the Distillers Company Limited for 
permission to publish this paper. 






REFERENCES 

Carmichael, R. D. (1937). Introduction to the Theory of Groups of Finite Order, §28. Boston: Ginn.

Finney, D. J. (1945). The fractional replication of factorial arrangements. Ann. Eugen., Lond., 12,
291-301.

Finney, D. J. (1946). Recent developments in the design of field experiments. III. Fractional
replication. J. Agric. Sci. 36, 184-91.

Fisher, R. A. (1942a). The Design of Experiments, 3rd ed., §45.2. London: Oliver and Boyd.

Fisher, R. A. (1942b). The theory of confounding in factorial experiments in relation to the theory
of groups. Ann. Eugen., Lond., 11, 341-53.

Kempthorne, O. (1947). A simple approach to confounding and fractional replication in factorial
experiments. Biometrika, 34, 255-72.





THE RELATIONSHIP BETWEEN FINITE GROUPS AND COM- 
PLETELY ORTHOGONAL SQUARES, CUBES, AND HYPER-CUBES 

By K. A. BROWNLEE and P. K. LORAINE 
The Research Department of the Distillers Company, Ltd., Epsom, Surrey

Completely orthogonal squares of side $\pi$ afford a means of introducing $(\pi+1)$ factors all
at $\pi$ levels into an experiment involving $\pi^2$ plots instead of the $\pi^{\pi+1}$ that would be required
for a full factorial experiment. It is pertinent to inquire how far interactions between pairs
of factors will affect the estimates of the main effects of the other factors. The theory of
fractional replication, as based on the theory of finite prime power groups by Finney (1945),
allows an approach to be made to this problem. Completely orthogonal hyper-cubes will
also be investigated.

Finney has indicated the theory of fractional replication of experiments involving $n$ factors
each at $\pi$ levels, where $\pi$ is any prime. The treatment combinations can be represented by
symbols $a^{\alpha}b^{\beta}c^{\gamma}\dots$, where the indices $\alpha, \beta, \gamma, \dots$ take only the values
$0, 1, 2, \dots, (\pi-1)$. If the symbols are interpreted according to the ordinary laws of algebra,
with the additional condition
$$a^{\pi} = b^{\pi} = c^{\pi} = \dots = 1,$$
then it is found that the product of any two symbols is a third. The complete set of symbols
for a full factorial experiment forms a prime power group of modulus $\pi$ and order $\pi^n$.

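Two symbols $a^{\alpha}b^{\beta}c^{\gamma}\dots$ and $a^{\alpha'}b^{\beta'}c^{\gamma'}\dots$ are said to be orthogonal if

$$\alpha\alpha' + \beta\beta' + \gamma\gamma' + \dots \equiv 0 \pmod{\pi}.$$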

If we have a subgroup of order $\pi^p$ it is possible to select a second subgroup of order $\pi^{n-p}$
such that all its elements are orthogonal to those of the first subgroup; this second subgroup
is said to be the complete orthogonal subgroup of the first subgroup. Finney has shown that
if the first subgroup represents the treatments carried out in a fractional replication of a
factorial experiment then the effects are interconfused in sets of $\pi^{n-p}$, and to find the
aliases of any effect we multiply the second subgroup by that effect. The second subgroup is
therefore called the alias subgroup.

Formation of completely orthogonal squares of any prime number 

To construct a completely orthogonal square of side $\pi$, where $\pi$ is a prime, we need to allocate
to each of the $\pi^2$ plots their row and column numbers and their levels of the remaining
$(\pi-1)$ factors. The square can be considered either to define the levels of $(\pi+1)$ factors or
to define the letters of $(\pi-1)$ alphabets in a square.

Consider the treatment subgroup of $\pi^2$ elements generated by $x_1P_1$, $x_2P_2$, where

$$P_1 = abc\dots w, \qquad P_2 = ab^2c^3\dots w^{\pi-1},$$

and $w$ is the $(\pi-1)$th alphabet. The exponents of $x_1$ and $x_2$, reduced to modulus $\pi$, will run
from 0 to $(\pi-1)$ and can therefore be used to define the cells in the square with co-ordinates
$y_1, y_2$. The element corresponding to cell $(y_1, y_2)$ will be $(x_1P_1)^{y_1}(x_2P_2)^{y_2}$ and the levels of the
$(\pi-1)$ alphabets are the exponents, reduced to modulus $\pi$, of the letters $a$ to $w$ in this product.
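The construction can be sketched in a few lines. In the product $(x_1P_1)^{y_1}(x_2P_2)^{y_2}$ the $r$th letter carries the exponent $y_1 + r\,y_2$, so the level of the $r$th alphabet in cell $(y_1, y_2)$ is that quantity reduced modulo the prime. The function names below are hypothetical, and the code is an illustration of the rule rather than the authors' own procedure.

```python
def orthogonal_squares(p):
    """squares[r - 1][y1][y2] is the level of the r-th alphabet in cell (y1, y2)."""
    return [[[(y1 + r * y2) % p for y2 in range(p)] for y1 in range(p)]
            for r in range(1, p)]

def is_latin(square):
    """Every level occurs exactly once in each row and each column."""
    p = len(square)
    levels = set(range(p))
    rows_ok = all(set(row) == levels for row in square)
    cols_ok = all({square[i][j] for i in range(p)} == levels for j in range(p))
    return rows_ok and cols_ok

def are_orthogonal(s, t):
    """Each pair of levels (one from s, one from t) occurs exactly once."""
    p = len(s)
    pairs = {(s[i][j], t[i][j]) for i in range(p) for j in range(p)}
    return len(pairs) == p * p

def is_completely_orthogonal(p):
    squares = orthogonal_squares(p)
    return (all(is_latin(s) for s in squares)
            and all(are_orthogonal(squares[i], squares[j])
                    for i in range(len(squares))
                    for j in range(i + 1, len(squares))))

for row in orthogonal_squares(5)[1]:   # second alphabet of the 5 x 5 square
    print(row)
```

For a prime side the function `is_completely_orthogonal` returns `True`; for a composite side such as 4 the same rule fails, which is why sides 4 and 8 need separate treatment.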

Biometrika 35 18 



278 Finite groups and completely orthogonal squares and cubes 

It is obvious that every level of each alphabet occurs just once in each row and each column.
To see that this square is completely orthogonal, we need also to show that each level of any
alphabet occurs once and once only with each level of any other alphabet. Suppose that the
levels of the $r$th and $s$th alphabets ($r \neq s$) are the same for two different cells $(p_1, p_2)$ and
$(q_1, q_2)$. Then if their exponents in $P_i$ are $\rho_i$ and $\sigma_i$ respectively ($i = 1, 2$), we have

$$\rho_1p_1 + \rho_2p_2 = \rho_1q_1 + \rho_2q_2, \qquad \sigma_1p_1 + \sigma_2p_2 = \sigma_1q_1 + \sigma_2q_2 \pmod{\pi},$$

and since $p_2 \neq q_2$, this leads to

$$\frac{\rho_1}{\rho_2} = \frac{\sigma_1}{\sigma_2}. \tag{1}$$

Here $\rho_i = 1 + (i-1)(r-1)$, $\sigma_i = 1 + (i-1)(s-1)$ $(i = 1, 2)$,

which gives the contradiction $r = s$. Hence the square is completely orthogonal.

The present treatment using prime power groups is essentially that of Stevens (1939).
We can proceed to derive the alias subgroup of order $\pi^{\pi-1}$ completely orthogonal to the above
treatment subgroup of order $\pi^2$. It can be generated by $X_1X_2A^{\pi-1}$ and the $(\pi-2)$ other
elements of the form $X_1^{t}A^{\pi-1-t}Q_{t+1}$, where $t = 1, 2, \dots, \pi-2$ and $Q_t$ is the $t$th alphabet. For
example, in the case of $\pi = 5$ the treatment subgroup generators are $x_1abcd$, $x_2ab^2c^3d^4$, and
the alias subgroup generators are $X_1X_2A^4$, $X_1A^3B$, $X_1^2A^2C$, $X_1^3AD$.
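The orthogonality of these generators can be checked numerically. The sketch below assumes the general form reads $X_1^{t}A^{\pi-1-t}Q_{t+1}$ (so that for $\pi = 5$ the third generator is $X_1^2A^2C$); the dictionary representation and the names `q1`, `q2`, ... for the alphabets are illustrative choices, not the paper's notation.

```python
def treatment_generators(p):
    """x1*P1 and x2*P2 as {letter: exponent}; 'q1'..'q(p-1)' are the alphabets."""
    g1 = {"x1": 1, **{f"q{r}": 1 for r in range(1, p)}}
    g2 = {"x2": 1, **{f"q{r}": r for r in range(1, p)}}
    return [g1, g2]

def alias_generators(p):
    """X1 X2 A^(p-1) together with X1^t A^(p-1-t) Q_(t+1) for t = 1..p-2."""
    gens = [{"x1": 1, "x2": 1, "q1": p - 1}]
    for t in range(1, p - 1):
        gens.append({"x1": t, "q1": p - 1 - t, f"q{t + 1}": 1})
    return gens

def inner(u, v, p):
    """Sum of products of corresponding exponents, reduced mod p."""
    return sum(e * v.get(letter, 0) for letter, e in u.items()) % p

def all_orthogonal(p):
    return all(inner(a, g, p) == 0
               for a in alias_generators(p)
               for g in treatment_generators(p))

print(alias_generators(5))
print(all_orthogonal(5))
```

Since orthogonality of the generators carries over to all their products, a check on the generators alone is sufficient.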

Squares of sides 4 and 8

The present treatment using prime-power groups is not immediately adaptable to the case
of squares of sides of the form $\pi^n$, but we can deal with the case of $2^2$ and $2^3$ in the following
manner.

Fisher & Yates (1943) give as the treatments for the completely orthogonal square of 
side 4, writing in order columns, rows and the three alphabets, 

11111, 12234, 13342, 14423, 21222, etc. 

Let us represent each four-level factor by two two-level factors, so that A, B and AB
correspond to the 3 degrees of freedom for the first four-level factor, etc. Thus

1 = b, 2 = (1), 3 = a and 4 = ab.

The treatment combinations then can be written

bdfhk, bgjk, bcegh, bcdefj, d, etc.

This particular set of elements does not form a subgroup, as it does not contain the identity.
It can be converted into a subgroup, however, by multiplying through by any of its elements.
Selecting d as the multiplier, we get

bfhk, bdgjk, bcdegh, bcefj, (1), etc.

This subgroup can be generated by

bfhk, bdgjk, cehjk, aegj.

We have 5 four-level factors or 10 two-level factors, so the full factorial experiment will
have $4^5 = 2^{10}$ elements and the alias subgroup completely orthogonal to the treatment
subgroup will have $2^{10-4} = 2^6$ elements and can be generated by

BDF, ACE, BEGH, AEFH, AFJK, DHK.
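The recoding of the square of side 4 can be followed step by step in code. This is a sketch under the assumptions that the level coding is 1 = b, 2 = (1), 3 = a, 4 = ab, that the letter pairs are (a, b), (c, d), (e, f), (g, h), (j, k) in the order columns, rows and the three alphabets, and that the alias generators read BDF, ACE, BEGH, AEFH, AFJK, DHK. The function names are hypothetical.

```python
PAIRS = ["ab", "cd", "ef", "gh", "jk"]                # letters for each four-level factor
CODE = {"1": "01", "2": "00", "3": "10", "4": "11"}   # level -> which letters of the pair appear

def to_letters(levels):
    """Translate e.g. '12234' into the two-level treatment symbol 'bgjk'."""
    out = ""
    for digit, pair in zip(levels, PAIRS):
        out += "".join(letter for letter, bit in zip(pair, CODE[digit]) if bit == "1")
    return out

def multiply(u, v):
    """Product of two symbols with exponents reduced mod 2."""
    return "".join(sorted(set(u) ^ set(v)))

def orthogonal(treatment, effect):
    """A treatment and an effect are orthogonal when they share an even number of letters."""
    return len(set(treatment) & set(effect.lower())) % 2 == 0

listed = ["11111", "12234", "13342", "14423", "21222"]
treatments = [to_letters(t) for t in listed]
shifted = [multiply("d", t) for t in treatments]      # multiply through by d

generators = ["bfhk", "bdgjk", "cehjk", "aegj"]
alias = ["BDF", "ACE", "BEGH", "AEFH", "AFJK", "DHK"]

print(treatments)
print([t or "(1)" for t in shifted])
print(all(orthogonal(t, e) for t in shifted + generators for e in alias))
```

Only the five treatments quoted in the text are used here; the remaining eleven cells of the square would be needed to confirm the whole subgroup.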

The treatment of squares of side 8 is similar to those of side 4. We represent the first
8-level factor by the three 2-level factors A, B and C, etc., such that

1 = (1), 2 = a, 3 = b, 4 = ab, 5 = c, 6 = ac, 7 = bc, 8 = abc.

The treatment subgroup of $8^2 = 2^6$ elements can be generated by

dgmopqr8tuioya, ehkprstuvwxz ft, fjlmnopqrtvxz, agknqtwz, bMoruxa, cjmpovyfl. 





The complete group corresponding to the full factorial experiment will have $8^9 = 2^{27}$ elements,
so the alias subgroup completely orthogonal to the above treatment subgroup will have
$2^{27-6} = 2^{21}$ elements.

The selection of 21 elements from the 2,097,152 to act as generators is not too easy, but 
a systematic method of approach leads to 

CDFN , BMP , DNQ , DHV, DTZ , BFL, FGHJKLM , 

FPS , BGNO , CM8V , F7W, DPX, CHOP , FF/ff, 

A//<2F, IVIFXa-, FJfF, JiVPT, A JOG, PFtf, C*V. 


Squares with less than the complete number of factors 

As they represent the most general condition, we have discussed the completely orthogonal 
squares. The above treatments can, however, be immediately adapted to squares with less 
than the complete number of factors. 

To consider a Latin square of side 5, for example, the complete treatment group for the
full factorial experiment has $5^3$ elements. The treatment subgroup as before has $5^2$ elements,
and these can be obtained from those previously given by missing out the last three letters,
i.e. the generators are $x_1a$, $x_2a$. The alias subgroup will now have $5^{3-2} = 5$ elements and may
be generated from $X_1X_2A^4$.

For a Graeco-Latin square the alias subgroup will have $5^{4-2} = 5^2$ elements, and we add
the second generator given previously, $X_1A^3B$. Similarly, for a hyper-Graeco-Latin square
the alias subgroup has $5^{5-2} = 5^3$ elements, and we add the third generator given previously,
$X_1^2A^2C$.

In all cases above for each factor omitted the appropriate treatment subgroup is obtained
by missing out from the generators the letters corresponding to the omitted factor, and the
appropriate alias subgroup is obtained by omitting those generators containing letters
corresponding to the omitted factor.


Relationship between main effects and interactions between other factors 

In all cases for squares ranging from Latin to completely orthogonal the alias subgroups
contain three-letter elements of the type $A^{\alpha}B^{\beta}C^{\gamma}$, and every factor occurs in such elements.
Thus all the degrees of freedom for the A main effect have as aliases terms of the type $B^{\beta'}C^{\gamma'}$,
i.e. first-order interactions. In using orthogonal squares for experimental designs it is
therefore essential that no pair of factors should be liable to interact. In view of the
unsatisfactory nature of this result, one is led to inquire whether Latin or completely orthogonal
cubes or hyper-cubes will provide a sounder basis for experimental designs.


Orthogonality in three dimensions 

Consider a treatment subgroup generated by elements of the form $x_1P_1$, $x_2P_2$, $x_3P_3$, where
$$P_1 = abcd\dots w, \qquad P_2 = ab^2c^3d^4\dots w^{\pi-1}, \qquad P_3 = ab^3c^5d^7\dots w^{2\pi-3},$$
in which $w$ is the $(\pi-1)$th letter, and whenever the exponent of one of the letters $a$ to $w$
becomes 0 (mod $\pi$) such a letter is deleted from all the generators. The letter to be deleted
will in fact be the $\tfrac{1}{2}(\pi+1)$th. This treatment subgroup of $\pi^3$ elements can be considered to
define the levels of $(\pi+1)$ factors or the levels of $(\pi-2)$ alphabets in a cube, the position




co-ordinates in this cube being given by the $x_i$. The alphabets are numbered before the
deletion is made and for convenience this numbering will be retained. The levels of the
alphabets in cell $(y_1, y_2, y_3)$ are given by the exponents of the letters $a$ to $w$ in the element
$(x_1P_1)^{y_1}(x_2P_2)^{y_2}(x_3P_3)^{y_3}$.

The condition for such a cube to be Latin is that in any ‘file’ defined by two of the $y_i$
being constant, each of the $\pi$ levels of each alphabet should occur once and once only. This
condition is obviously fulfilled here.

For the cube to be completely orthogonal we must also have that each plane section
$y_i$ = const. $(i = 1, 2, 3)$ forms an orthogonal square. The arguments used above to obtain
(1) hold here if suffixes $j$ and $k$ are substituted for 1 and 2 throughout $(j, k = 1, 2, 3)$. Therefore
we have

$$\frac{\rho_j}{\rho_k} = \frac{\sigma_j}{\sigma_k},$$

where $\rho_i = 1 + (i-1)(r-1)$, $\sigma_i = 1 + (i-1)(s-1)$ $(i = j, k)$, which leads to $r = s$ as before.
The alias subgroup will be of order $\pi^{\pi-2}$ and has $(\pi-2)$ generators of the form

TT 1 Tf ^ 

where $t = 1, 2, \dots, \pi-2$, and $Q_t$ is the $t$th alphabet.
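The cube can be built and tested directly from the exponents. In the sketch below the level of the $r$th alphabet in cell $(y_1, y_2, y_3)$ is taken as $y_1 + r\,y_2 + (2r-1)\,y_3$ reduced modulo the prime, and an alphabet is dropped when $2r - 1$ vanishes modulo the prime. The function names are hypothetical and the check is a brute-force illustration, not the argument used in the text.

```python
from itertools import combinations, product

def cube_alphabets(p):
    """Coefficients (1, r, 2r-1) of the surviving alphabets, keyed by r."""
    return {r: (1, r, (2 * r - 1) % p)
            for r in range(1, p) if (2 * r - 1) % p != 0}

def level(coeffs, cell, p):
    """Level of an alphabet in a cell: the exponent reduced mod p."""
    return sum(c * y for c, y in zip(coeffs, cell)) % p

def is_completely_orthogonal_cube(p):
    alphabets = cube_alphabets(p)
    cells = list(product(range(p), repeat=3))
    # Latin condition: along every file each level occurs exactly once.
    for coeffs in alphabets.values():
        for axis in range(3):
            files = {}
            for cell in cells:
                key = cell[:axis] + cell[axis + 1:]
                files.setdefault(key, set()).add(level(coeffs, cell, p))
            if any(len(levels) != p for levels in files.values()):
                return False
    # Every plane section must be an orthogonal square for every pair of alphabets.
    for (_, cr), (_, cs) in combinations(alphabets.items(), 2):
        for axis in range(3):
            for const in range(p):
                pairs = {(level(cr, c, p), level(cs, c, p))
                         for c in cells if c[axis] == const}
                if len(pairs) != p * p:
                    return False
    return True

print(sorted(cube_alphabets(5)))          # alphabets kept for the 5 x 5 x 5 cube
print(is_completely_orthogonal_cube(5))
```

For side 5 the third alphabet is the one deleted, in agreement with the rule that the $\tfrac{1}{2}(\pi+1)$th letter disappears, leaving $\pi - 2 = 3$ alphabets.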


Orthogonality in four or more dimensions 

Considerations analogous to those for three dimensions show that for four dimensions the
treatment subgroup generators will be

$$x_1abc\dots w, \qquad x_2ab^2c^3\dots w^{\pi-1}, \qquad x_3ab^3c^5\dots w^{2\pi-3}, \qquad x_4ab^4c^7\dots w^{3\pi-5},$$

where, as before, whenever the exponent of one of the letters $a$ to $w$ becomes 0 (mod $\pi$),
such a letter is deleted from all generators. Thus the $\tfrac{1}{2}(\pi+1)$th letter will be deleted and also
the $\tfrac{1}{3}(\pi+2)$th or the $\tfrac{1}{3}(2\pi+2)$th, according as $\pi$ is of the form $(3k+1)$ or $(3k+2)$, $k$ being an
integer.

The generalization to $n$ dimensions is now obvious. The treatment subgroup will have $n$
generators of the form $x_iP_i$, where

$$P_i = a\,b^{i}\,c^{2i-1}\dots w^{(i-1)\pi-(2i-3)} \qquad (i = 1, 2, \dots, n),$$

deletions being made as before. After deletions there will thus always be a total of $(\pi+1)$
classifications or factors.

The proofs of complete orthogonality follow exactly as for three dimensions. 

The treatment subgroup is of order $\pi^n$ and therefore the alias subgroup is of order $\pi^{\pi+1-n}$
and has generators of the form

$$X_1X_2\dots X_nA^{\pi-1},$$

xi+ l xi+o-‘xi-' •' x; - 2 -' . . . a *-'- 1 a <+1 , 

where $Q_t$ is the $t$th alphabet and, if $m_1, m_2, \dots, m_{n-2}$ are the numbers in ascending order of
the omitted alphabets, then

$$t = 1, 2, \dots, (m_1-1), (m_1+1), \dots, (m_2-1), (m_2+1), \dots, (m_{n-2}-1), (m_{n-2}+1), \dots, \pi-2.$$





Experimental designs : Latin hyper-cubes 

Inspection of the generators for the treatment subgroups shows that a Latin (i.e. having
only one alphabet) hyper-cube in any number of dimensions is always possible for any prime.
For example, we can have a Latin hyper-cube in four dimensions for $\pi = 3$. This will have
5 classifications and be based on $3^4$ cells; it is thus a solution of the one-third replicate of
5 factors at three levels. This solution is not actually the same as given by Finney (1945) as
he selected one more satisfactory for confounding. Similarly the one-fifth replicate of
6 factors at five levels can be generated immediately and its alias subgroup will have the one
generator $X_1X_2X_3X_4X_5A^4$, thus allowing first-order interactions to be estimated with third
orders as aliases.

Experimental designs: Graeco-Latin hyper-cubes 

In view of the form of the generators for the treatment subgroup, and in particular the
property of elimination of letters where exponents become 0 (mod $\pi$), it is clearly impossible
to generate, with this method, $n$-dimensional hyper-cubes other than Latin except when
$\pi \geq n+1$.

In the case of Graeco-Latin hyper-cubes we have two alphabets and $n$ co-ordinates making
up $(2+n)$ factors in $\pi^n$ plots. The order of replication is therefore $\pi^{-2}$, and the alias subgroup,
generated by

$$X_1X_2\dots X_nA^{\pi-1}, \qquad X_1X_3^{-1}\dots X_n^{-(n-2)}A^{\pi-2}B,$$

will have no element with less than $(n+1)$ letters.

In the case of three dimensions, for $\pi = 3$ only a Latin cube is possible. With $\pi \geq 5$, the
Graeco-Latin is possible, and the main effects of all factors will have second-order
interactions as aliases. For example, with $\pi = 5$ we can have 5 factors with $5^3 = 125$ plots.

In the case of four dimensions, $\pi$ must be $\geq 5$. For example, with $\pi = 5$, the two alphabets
and four co-ordinates make 6 factors in $5^4 = 625$ plots. Main effects have third-order
interactions and first-order interactions have second-order interactions as their lowest order
aliases.

On going to five dimensions the minimum prime is 7 and hence the minimum number of
plots is $7^5 = 16807$.
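The minimum primes quoted above follow from counting the alphabets that survive deletion. The sketch below assumes the exponent of the $r$th letter in the $i$th generator is $1 + (i-1)(r-1)$, as in the orthogonality proof, and treats an alphabet as surviving when none of these exponents vanishes modulo the prime. The function names are hypothetical.

```python
def surviving_alphabets(p, n):
    """Alphabets r = 1..p-1 whose exponent 1 + (i-1)(r-1) is non-zero mod p for i = 1..n."""
    return [r for r in range(1, p)
            if all((1 + (i - 1) * (r - 1)) % p for i in range(1, n + 1))]

def is_prime(m):
    return m > 1 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def smallest_prime(n, alphabets=2):
    """Smallest prime side giving at least `alphabets` alphabets in n dimensions."""
    p = 2
    while not (is_prime(p) and len(surviving_alphabets(p, n)) >= alphabets):
        p += 1
    return p

for n in (3, 4, 5):
    p = smallest_prime(n)
    print(n, p, p ** n)   # dimensions, minimum prime, number of plots
```

The second alphabet has exponent $i$ in the $i$th generator, so it survives only when the prime exceeds the number of dimensions, which is the source of the restriction on Graeco-Latin hyper-cubes.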

Experimental designs: completely orthogonal hyper-cubes

Consideration of the general alias subgroup shows that if this is of order $\pi^m$, $m \geq 3$, then its
generators must include $(m-1)$ of the type

XiXJ-L.. XI 

Since the subgroup generated by any pair of elements of this type contains elements of the
form $A^{\alpha}B^{\beta}C^{\gamma}$, and since every alphabet occurs in such elements, all the factors corresponding
to the alphabets include first-order interactions amongst their aliases. The factors
corresponding to the co-ordinates will be better, however, having as aliases interactions of order
$(n-1)$.

For example, with $\pi = 5$, in three dimensions, we can have three alphabets. The total
number of plots is $5^3 = 125$ and the order of replication $5^{-3} = 1/125$. Three factors have
first-order aliases and three factors have second-order aliases. With $\pi = 7$, the total number of
plots is $7^3 = 343$, being a $7^{-5} = 1/16807$ replicate. Five factors have first-order interactions
as aliases and three have second-order aliases.





Additional note on ‘The design of optimum multifactorial experiments’

Kempthorne (1947) has pointed out that the multifactorial designs of Plackett & Burman 
(1946) are in some cases high-order fractionally replicated factorial designs. 

Thus their case of N = 16, L = 2, n = 15 is a $2^{-11}$ replicate of the $2^{15}$ experiment. The alias
subgroup of $2^{11}$ elements for their design can be generated by

ABN, ACK, ADE, AFL, AGJ, AHO, AMP, BCO, BDL, BGM, CDP.

Altogether there are 35 three-letter elements in the alias subgroup, and each factor has,
amongst its aliases, 7 first-order interactions involving the other 14 factors.
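The count of three-letter elements can be reproduced by generating the whole alias subgroup. This sketch assumes the eleven generators read as ABN, ACK, ADE, AFL, AGJ, AHO, AMP, BCO, BDL, BGM, CDP over the fifteen letters A to P (omitting I); the helper names are hypothetical.

```python
def multiply(u, v):
    """Product of two symbols with exponents reduced mod 2."""
    return "".join(sorted(set(u) ^ set(v)))

def span(generators):
    """Subgroup generated by the given symbols; '' is the identity."""
    group = {""}
    for g in generators:
        group |= {multiply(g, h) for h in group}
    return group

generators = ["ABN", "ACK", "ADE", "AFL", "AGJ", "AHO",
              "AMP", "BCO", "BDL", "BGM", "CDP"]
alias = span(generators)

three_letter = sorted(w for w in alias if len(w) == 3)
factors = "ABCDEFGHJKLMNOP"
per_factor = {f: sum(f in w for w in three_letter) for f in factors}

print(len(alias), len(three_letter))
print(per_factor)
```

Each three-letter element containing a given factor supplies one first-order interaction among that factor's aliases, so the per-factor count is the number of such interactions.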

Their case of N = 32, L = 2, n = 31 has an alias subgroup of $2^{31-5} = 2^{26}$ elements
containing 155 three-letter elements. Thus each factor has, amongst its aliases, 15 first-order
interactions involving the other 30 factors.

The three cases where N = 9, 25 and 49, L = 3, 5 and 7, and n = 4, 6 and 8 respectively
are completely orthogonal squares with the alias characteristics described above.

With N = 27, L = 3, n = 13, if we ascribe the letters a to n (omitting i) to the 13 factors
in the order given, the treatments are a subgroup generated by

$ad^2f^2gh^2j^2km^2n^2$, $bde^2fgh^2klm$, $cef^2ghj^2lmn$,
and the alias subgroup of $3^{10}$ elements is generated by

$BCL^2$, $A^2B^2K$, $D^2EG^2$, $EGN$, $CEL$, $C^2K^2G$, $B^2J^2F$, $ACJ$, $AL^2M$, $C^2HK$.

Thus in all the multifactorial designs amenable to representation as fractional replications,
all main effects contain first-order interactions amongst their aliases.
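The orthogonality of the alias generators to the treatment generators in the N = 27 case can be verified modulo 3. The sketch assumes the printed symbols carry the exponents shown (a digit following a letter is read as its exponent); `parse` and `inner` are hypothetical helper names.

```python
import re

P = 3  # number of levels

def parse(symbol):
    """'ad2f2' -> {'a': 1, 'd': 2, 'f': 2}; letters are compared in lower case."""
    return {letter.lower(): int(exp) if exp else 1
            for letter, exp in re.findall(r"([A-Za-z])(\d?)", symbol)}

def inner(u, v):
    """Sum of products of corresponding exponents, reduced mod P."""
    return sum(e * v.get(letter, 0) for letter, e in u.items()) % P

treatment = [parse(s) for s in
             ["ad2f2gh2j2km2n2", "bde2fgh2klm", "cef2ghj2lmn"]]
alias = [parse(s) for s in
         ["BCL2", "A2B2K", "D2EG2", "EGN", "CEL",
          "C2K2G", "B2J2F", "ACJ", "AL2M", "C2HK"]]

checks = [[inner(a, t) for t in treatment] for a in alias]
for symbol, row in zip(["BCL2", "A2B2K", "D2EG2", "EGN", "CEL",
                        "C2K2G", "B2J2F", "ACJ", "AL2M", "C2HK"], checks):
    print(symbol, row)
```

Every three-letter generator in this list shows a main effect with a first-order interaction among its aliases, which is the point made in the text.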

Thanks are due to the Distillers Company Limited for permission to publish this paper. 

REFERENCES 

Finney, D. J. (1945). The fractional replication of factorial arrangements. Ann. Eugen., Lond., 12 , 
291-301. 

Fisher, R. A. & Yates, F. (1948). Statistical Tables for Biological, Agricultural, and Medical Research 
(3rd edition). London: Oliver and Boyd. 

Kempthorne, O. (1947). A simple approach to confounding and fractional replication in factorial 
experiments. Biometrika, 34, 255-72. 

Plackett, R. L. & Burman, J. P. (1946). The design of optimum multifactorial experiments. 
Biometrika, 33, 305-25. 

Stevens, W. L. (1939). The completely orthogonalized Latin square. Ann. Eugen., Lond., 9, 82-93.





SYSTEMATIC SAMPLING OF CONTINUOUS 
PARAMETER POPULATIONS 

A. E. JONES 

[Editorial note. This paper was submitted by Dr Jones in the autumn of 1947. A number of points
arising on the original draft were discussed with him, and the Editors were about to consult him on
further amendments which they thought desirable when Dr Jones met an untimely death in a lift
accident on 7 May 1948. The amendments have, nevertheless, been made in the interests of clarity,
and the more important of them are indicated in this paper. No alteration has been made in the
substance of the paper or in the main formulae, but the second part has been omitted and is discussed
in the succeeding paper by Mr Kendall.]


Introduction 

1. A sample is usually said to be distributed systematically when the individual members
are selected according to some deterministic rule. For instance, if it were desired to estimate 
the mean of a characteristic of a mass-produced article, and a sample consisting of every 
tenth article were taken, we might say that this was a systematic one. Such a sample may, 
of course, be random if the method of selection of the samples is unconnected, in any con- 
ceivable manner, with the value of the characteristics under consideration. 

Estimates obtained from a systematic sample may sometimes be more accurate than those
from a random sample of the same size, but it is usually considered that the lower accuracy
of the random sample is more than counterbalanced by the fact that a more reliable estimate
of the sampling variation can be obtained by the latter method. The customary formulae giving an estimate
of sampling error are based on the assumption that there is no correlation between the 
individual members of the sample. It sometimes happens that neighbouring sample members 
are positively correlated, in which case the error estimated from the customary formulae may 
differ substantially from the true value. 

The problem to be considered here is that of estimating by a systematic sample the 
‘average value’ of a random variable from a one-dimensional homogeneous population
depending on a continuous parameter, such as the nitrogen content of soil along a one- 
dimensional strip. The phrase ‘average value’ requires closer definition. Suppose Z(t) be 
used to denote the actual nitrogen content at a fixed point distance t from one end of the 
strip under consideration. Then Z(t) may be regarded as a random variable with a hypo- 
thetical population consisting of all the nitrogen contents at this point throughout a number 
of years. There is an infinity of random variables corresponding to all the points of the 
$t$-interval considered. We are not interested in the means of these populations, but in the
value of the nitrogen content of the whole of the strip of soil. Thus, if $T_0$ and $T$ $(> T_0)$ are the
ends of the interval, the average value in which we are interested is

$$\bar{Z} = \frac{1}{T - T_0}\int_{T_0}^{T} Z(t)\,dt,$$

which is not a parameter of any of the hypothetical populations mentioned above, and, in 
fact, is itself a random variable. 

It will only be possible to make a finite number of observations and if, say, $n$ is the
maximum number it is proposed to make, we would wish to dispose them in such a way that
the mean-square error of our estimate of $\bar{Z}$ will be a minimum. Hence if $z(t_1), z(t_2), \dots, z(t_n)$
represent observed variates at $t_1, t_2, \dots, t_n$ respectively, and if our best* estimate of $\bar{Z}$ is the
linear combination

$$a_1z(t_1) + a_2z(t_2) + \dots + a_nz(t_n),$$

the mean-square error of the estimate will be

$$S = E[a_1z(t_1) + a_2z(t_2) + \dots + a_nz(t_n) - \bar{Z}]^2, \tag{1·1}$$


where the operator E denotes expected value. 

Apart from determining $t_1, t_2, \dots, t_n$, we also require the mean-square error of this estimate of
$\bar{Z}$. Mathematically, this resolves into two problems: first, that of obtaining the mean-square
error as a function of the (unknown) parameters of the distribution of $Z(t)$, and secondly
obtaining a formula in terms of actual sample values which will provide a satisfactory
estimate of this function.

It will be shown, on reasonable hypotheses, that if the correlation between the successive
values at the points $t_1, t_2, \dots, t_n$ is very small, the best method of distributing $n$ members
along a strip of length $T$ is at distances $T/(n+1)$, $2T/(n+1)$, ... from one end, while if the
correlation is greater than 0·25 the best distribution will be at distances $T/2n$, $2T/2n$, ...

Theoretical representation of Z(t) — assumptions 

2. It is necessary to make certain assumptions about the way in which the
random function $Z(t)$ varies. Physically, we should expect that there will be some smooth
curve, or trend, about which $Z(t)$ will fluctuate randomly. In addition, we should expect that
$Z(t)$ will be continuous in $t$; in other words we should expect that by making $h$ small we could
make $[Z(t+h) - Z(t)]$ as small as we please. Incidentally, it follows from this that it is
impossible to regard the random variables $Z(t)$ and $Z(t+h)$ as independent for all values of $h$.

Thus any recorded value, $z(t_1)$, is the sum of three components: (a) an error of observation;
(b) the value of the trend at $t_1$; (c) a random variable, which will be denoted by $X(t_1)$,
representing $Z(t_1)$ less the value of the trend at $t_1$. From the definition of the trend we may take

$$E\,X(t_1) = 0 \quad \text{for all } t_1. \tag{2·1}$$

The following assumptions will be made about these three components: 

(A) That the error of observation, i.e. $z(t_1) - Z(t_1)$, is a random variable with zero mean, and is uncorrelated with all other errors of observation, and with $Z(t)$ for all t. $x(t_1)$ will be used to denote the sum of $X(t_1)$ and the error of observation at $t_1$. In other words, $x(t_1)$ is $z(t_1)$ corrected for trend.

(B) In the first place it will be assumed that the trend is simply a constant for all t. If

$$\bar{X} = \frac{1}{T - T_0}\int_{T_0}^{T} X(t)\,dt, \qquad (2.2)$$

the problem of minimizing S of equation (1.1) then reduces to the problem of minimizing

$$E\{a_1 x(t_1) + a_2 x(t_2) + \dots + a_n x(t_n) - \bar{X}\}^2, \qquad (2.3)$$


* Throughout this paper the expressions ‘best estimate’ or ‘optimum estimate’ will be used to mean the linear estimate whose mean-square error is a minimum. [The expression ‘mean-square error’ is used instead of ‘variance’ because we are considering a minimization of the type $E(X - Y)^2$, where X and Y are random variables, not of the type $E\{X - E(X)\}^2$. - Ed.]



A. E. Jones 


285 


provided we confine our attention to unbiased estimates, i.e. make the restriction*

$$\sum_{i=1}^{n} a_i = 1. \qquad (2.4)$$

(C) About $X(t)$ it will be assumed that:

(1) The variance of $X(t) = E\{X(t)\}^2 = \lambda$, a constant independent of t. (2.5)

(2) The correlation of $X(s)$ and $X(t)$ depends only on $|s - t|$, and decreases exponentially as $|s - t|$ increases. $E\{X(s)X(s+t)\}$ will be denoted by $\lambda R(t)$, where $R(t)$ is the correlation function of $X(t)$. Thus

$$R(t) = e^{-q|t|} \quad (q > 0). \qquad (2.6)$$

It follows that for h small

$$E\{X(t+h) - X(t)\}^2 = 2\lambda(1 - e^{-q|h|}) \qquad (2.7)$$

is also small, and since $E\{X(t+h) - X(t)\} = 0$, it will be seen that by these assumptions we imply that $X(t)$ is continuous in the mean for all t.†

I consider that the above assumptions are the simplest that satisfy the required physical conditions. (A) represents nothing new. Although it is assumed that the trend is a constant for all t, the results are valid if the trend is a linear function of t. As regards the assumption (C) it is worth noting, though it will not be discussed here, that there is a ‘small-error’ justification for (2.5) and (2.6) similar to the usual small-error justification for the assumption of normality. They are both derived from the Central Limit Theorem.

I have made some practical attempts to decide whether the above assumptions are 
satisfactory for a number of meteorological variables, e.g. wind speed and air temperature. 
It was found that the observations analysed could be satisfactorily interpreted on these 
hypotheses, but a really thorough examination of their appropriateness would require 
computation far beyond the author’s resources.‡

Systematic selection of sampling points 

3. Suppose that sample members are selected at points $t_1, t_2, \dots, t_n$ in that order, and let $X_i = X(t_i)$ be the observed variables corrected for trend. Also let $x_i$ denote the corresponding observations corrected for trend. Then the expected value of $(x_i - X_i)$ will be zero.

Let $$E(x_i - X_i)^2 = \mu.$$

Then, expressed mathematically, the problem with which we shall be concerned here is that of estimating the random variable, I, where

$$I = (T - T_0)\bar{X} = \int_{T_0}^{T} X(t)\,dt,$$

from the random variables, $x_1, x_2, \dots, x_n$.

* [Since $Ex(t) = 0$ the expected value of $\Sigma a_i x(t_i)$ is zero, not $\bar{X}$. The author means that the average of $\Sigma a_i X(t_i)$ over all t is equal to $\bar{X}$, which gives his condition that $\Sigma a_i = 1$. - Ed.]

† If, in addition, we make the assumption (which is irrelevant to the subsequent development of this paper) that $X(t)$ is Gaussian, then $X(t)$ must be a stationary Gaussian-Markoff random function. It has been shown that such functions are everywhere continuous.

‡ [It was the Editor’s intention to suggest that Dr Jones should include some of this experimental verification in his paper. The author assumes that $X(t)$ has the same mean, variance and correlation function for all t. His problem is analogous to the estimation of the stationary stochastic component of a continuous time-series defined at all points of time as the sum of three components, a linear trend, a random variable which possesses a correlogram decaying exponentially to zero and a ‘superposed’ error. - Ed.]



286 Systematic sampling of continuous parameter populations 

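Now since by 2 (A) and 2 (C)

$$E[(x_i - X_i)(x_j - X_j)] = 0 \quad (i \neq j),$$

the correlation matrix of the vector $x = \{x_1, x_2, \dots, x_n\}$ is given by

$$E(x_i x_j) = \lambda e^{-q|t_i - t_j|} \quad (i \neq j), \qquad E(x_i x_j) = \lambda + \mu \quad (i = j). \qquad (3.1)$$

If

$$P_i = e^{-qt_i}, \quad P_0 = e^{-qT_0}, \quad P_{n+1} = e^{-qT}, \qquad (3.2)$$

then

$$E(xx') = \lambda R + \mu I_n, \qquad (3.3)$$

where

$$R = \begin{pmatrix} 1 & P_2/P_1 & P_3/P_1 & \dots & P_n/P_1 \\ P_2/P_1 & 1 & P_3/P_2 & \dots & P_n/P_2 \\ P_3/P_1 & P_3/P_2 & 1 & \dots & P_n/P_3 \\ \vdots & \vdots & \vdots & & \vdots \\ P_n/P_1 & P_n/P_2 & P_n/P_3 & \dots & 1 \end{pmatrix} \qquad (3.4)$$

and $I_n$ is the unit matrix.*

Also, denoting $E(x_i I)$ by $n_i$, we have that

$$n_i = E\int_{T_0}^{T} X(t)\,x_i\,dt = \int_{T_0}^{T} E[X(t)\,x_i]\,dt = \frac{\lambda}{q}\left(2 - \frac{P_i}{P_0} - \frac{P_{n+1}}{P_i}\right). \qquad (3.5)$$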
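As a numerical companion to (3.1)-(3.5), the mean-square error of any weighted estimate of I can be evaluated directly from the covariance structure. The following is a minimal Python sketch and not part of the original paper: the function name `mean_square_error` and its argument names are illustrative assumptions, and the closed form used for $E(I^2)$ is simply the double integral of $\lambda e^{-q|t-u|}$ over the sampled interval.

```python
import math

def mean_square_error(t, v, T0, T, lam, mu, q):
    """S = v'(lam*R + mu*I)v - 2 v'n + E(I^2), with I the integral of X over [T0, T].

    t  : sampling points, v : weights (hypothetical argument names)
    lam: variance of X(t), mu: variance of the observation error, q: decay rate
    """
    n = len(t)
    # Quadratic form v'(lam*R + mu*I)v, using R_ij = exp(-q|t_i - t_j|) as in (3.1)-(3.4)
    quad = 0.0
    for i in range(n):
        for j in range(n):
            cov = lam * math.exp(-q * abs(t[i] - t[j]))
            if i == j:
                cov += mu
            quad += v[i] * v[j] * cov
    # Cross term v'n, with n_i = (lam/q)(2 - P_i/P_0 - P_{n+1}/P_i) as in (3.5)
    cross = sum(
        v[i] * (lam / q) * (2.0 - math.exp(-q * (t[i] - T0)) - math.exp(-q * (T - t[i])))
        for i in range(n)
    )
    # E(I^2): double integral of lam*exp(-q|t-u|) over the square [T0, T]^2
    d = q * (T - T0)
    e_i2 = 2.0 * lam * (d - 1.0 + math.exp(-d)) / q ** 2
    return quad - 2.0 * cross + e_i2

if __name__ == "__main__":
    weights = [0.25] * 4
    midpoints = [(2 * i + 1) / 8 for i in range(4)]   # T/2n, 3T/2n, ...
    spread = [(i + 1) / 5 for i in range(4)]          # T/(n+1), 2T/(n+1), ...
    print(mean_square_error(midpoints, weights, 0.0, 1.0, 1.0, 0.0, 1.0))
    print(mean_square_error(spread, weights, 0.0, 1.0, 1.0, 0.0, 1.0))
```

With a fairly high correlation between neighbours (here q = 1 on a unit interval) the mid-point design gives the smaller value, in line with the conclusion stated in the introduction.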


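Theorem 1. Subject to the restriction that the sum of the weights, $v_i$, is $(T - T_0)$, the best estimate of I that can be obtained from any n samples is $\sum_{i=1}^{n} v_i x_i$, the sample values $x_i$ being selected at points $t_i$ $(t_i > t_{i-1})$, where

(a) $$v_1 = v_2 = \dots = v_n = \frac{T - T_0}{n}, \qquad (3.6)$$

(b) $$t_2 - t_1 = t_3 - t_2 = \dots = t_n - t_{n-1} = \frac{1}{q}\log\frac{k + \dfrac{\lambda - \mu}{2\lambda}\dfrac{q(T - T_0)}{n}}{k - \dfrac{\lambda + \mu}{2\lambda}\dfrac{q(T - T_0)}{n}},$$

(c) $$t_1 - T_0 = T - t_n = -\frac{1}{q}\log\left\{k - \frac{\lambda + \mu}{2\lambda}\frac{q(T - T_0)}{n}\right\}, \qquad (3.7)$$

and k satisfies the equation

$$\left\{k - \frac{\lambda + \mu}{2\lambda}\frac{q(T - T_0)}{n}\right\}^{n+1} \left\{k + \frac{\lambda - \mu}{2\lambda}\frac{q(T - T_0)}{n}\right\}^{-(n-1)} = e^{-q(T - T_0)}. \qquad (3.8)$$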

Proof. The mean-square error of $v'x$ as an estimate of I will be

$$S = E(I - v'x)^2 = v'(\lambda R + \mu I_n)v - 2v'n + E(I^2). \qquad (3.9)$$


* The matrix notation used in this paper will be that of The Theory of Canonical Matrices by Turnbull 
and Aitken. 





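In order that $v'x$ be the best estimate, S must be a minimum for all $t_i$ and all possible $v_i$. Considering first the conditions that S should be a minimum with respect to the $t_i$, by differentiating (3.9) partially with respect to $t_i$ we obtain by (3.3) and (3.5)

$$\frac{1}{\lambda}\frac{\partial S}{\partial t_i} = -2qv_i\sum_{j=1}^{i-1} v_j\frac{P_i}{P_j} + 2qv_i\sum_{j=i+1}^{n} v_j\frac{P_j}{P_i} - 2v_i\left(\frac{P_i}{P_0} - \frac{P_{n+1}}{P_i}\right) = 0,$$

i.e.

$$\frac{P_i}{P_0} - \frac{P_{n+1}}{P_i} = -q\sum_{j=1}^{i-1} v_j\frac{P_i}{P_j} + q\sum_{j=i+1}^{n} v_j\frac{P_j}{P_i} \quad (i = 1, \dots, n). \qquad (3.10)$$

These n equations can be combined in the matrix form

$$m = qDv, \qquad (3.11)$$

where m is the vector with elements $P_i/P_0 - P_{n+1}/P_i$ and D is the skew-symmetric matrix with elements $P_j/P_i$ $(j > i)$, $0$ $(j = i)$ and $-P_i/P_j$ $(j < i)$.

Considering now possible variations of v, the optimum values must satisfy

$$(\lambda R + \mu I_n)v + k'1 = n, \qquad (3.12)$$

i.e.

$$2k - \frac{P_i}{P_0} - \frac{P_{n+1}}{P_i} = q\left[\sum_{j=1}^{n} R_{ij}v_j + \frac{\mu}{\lambda}v_i\right] \quad (i = 1, \dots, n), \qquad (3.12')$$

where $1' = (1, 1, \dots, 1)$ and $k'$, k are scalar constants.

The optimum estimate of I which can be made from n samples must thus satisfy (3.11) and (3.12'). Hence

$$k - \frac{P_{n+1}}{P_i} = q\left[\frac{\lambda + \mu}{2\lambda}v_i + \sum_{j=i+1}^{n} v_j\frac{P_j}{P_i}\right], \qquad (3.13)$$

$$k - \frac{P_i}{P_0} = q\left[\sum_{j=1}^{i-1} v_j\frac{P_i}{P_j} + \frac{\lambda + \mu}{2\lambda}v_i\right] \quad (i = 1, \dots, n). \qquad (3.14)$$

Now from (3.13)

$$k(P_i - P_{i+1}) = q\left(\frac{\lambda + \mu}{2\lambda}P_i v_i + \frac{\lambda - \mu}{2\lambda}P_{i+1}v_{i+1}\right) \quad (i = 1, \dots, n-1). \qquad (3.15)$$

Similarly, from (3.14)

$$k(P_i - P_{i+1}) = q\left(\frac{\lambda - \mu}{2\lambda}P_{i+1}v_i + \frac{\lambda + \mu}{2\lambda}P_i v_{i+1}\right). \qquad (3.16)$$

Combining (3.15) and (3.16)

$$(v_i - v_{i+1})\left(\frac{\lambda + \mu}{2\lambda}P_i - \frac{\lambda - \mu}{2\lambda}P_{i+1}\right) = 0,$$

i.e.

$$v_i = v_{i+1} \quad (i = 1, 2, \dots, n-1). \qquad (3.6')$$

Writing $v_1 = v_2 = \dots = v_n = (T - T_0)/n$ in the equations, we have

$$\frac{P_2}{P_1} = \frac{P_3}{P_2} = \dots = \frac{P_n}{P_{n-1}} = \frac{k - \dfrac{\lambda + \mu}{2\lambda}\dfrac{q(T - T_0)}{n}}{k + \dfrac{\lambda - \mu}{2\lambda}\dfrac{q(T - T_0)}{n}}. \qquad (3.17)$$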


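Also we have

$$\frac{P_1}{P_0} = \frac{P_{n+1}}{P_n} = k - \frac{\lambda + \mu}{2\lambda}\frac{q(T - T_0)}{n}. \qquad (3.18)$$

Equations (3.17) and (3.18) lead immediately to results (3.7).

Finally, since

$$\frac{P_0}{P_1}\cdot\frac{P_1}{P_2}\cdots\frac{P_n}{P_{n+1}} = \frac{P_0}{P_{n+1}} = e^{q(T - T_0)},$$

k must be given by (3.8).

Corollary. It may sometimes happen because the trend is known theoretically, or through some other circumstance, that it is unnecessary to impose the restriction

$$v_1 + v_2 + \dots + v_n = T - T_0.$$

In this case we obtain by putting $k' = 0$, $k = 1$ in the equations (3.13) and (3.14) that the optimum estimate of I is given by

$$v_1 = v_2 = \dots = v_n = v \ \text{(say)},$$

$$\frac{P_1}{P_0} = \frac{P_{n+1}}{P_n} = 1 - \frac{\lambda + \mu}{2\lambda}qv, \qquad \frac{P_{i+1}}{P_i} = \frac{1 - \dfrac{\lambda + \mu}{2\lambda}qv}{1 + \dfrac{\lambda - \mu}{2\lambda}qv},$$

where v satisfies

$$\left(1 - \frac{\lambda + \mu}{2\lambda}qv\right)^{n+1}\left(1 + \frac{\lambda - \mu}{2\lambda}qv\right)^{-(n-1)} = e^{-q(T - T_0)}.$$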

Distribution of sample values — practical application 

4. Two simple conclusions emerge from equations (3.6)-(3.8). They are that the optimum estimate of the ‘average value’ for a given sample size is one in which the sample values are combined with equal weight 1/n and in which the distances between successive sample values are all equal, i.e. $t_2 - t_1 = t_3 - t_2 = \dots$. The distance of the first sample value from the end, $t_1 - T_0$, is governed by a rather more complicated equation, but it will be shown that $t_1 - T_0$ must lie between $\tfrac{1}{2}(t_2 - t_1)$ and $(t_2 - t_1)$. It is worth while to note, in passing, that though the formulae for $t_1, t_2, \dots, t_n$ have been worked out on the assumption that the trend (§2) is simply a constant, the formulae will still be accurate if the trend is a linear function of t.* In fact no other distribution of sample values is likely to decrease appreciably the error in the estimate of mean arising from the possible curvature of the trend.

It is only necessary to know the ratio $(t_1 - T_0)/(t_2 - t_1)$ to be able to specify the positions $t_1, t_2, \dots, t_n$ completely.

Theorem 2. If the distribution of sample values which gives the best estimate of the mean is at points $t_1, t_2, \dots, t_n$ $(t_i > t_{i-1})$, then

$$t_i - t_{i-1} = t_{i+1} - t_i \quad (i = 2, 3, \dots, n-1), \qquad t_1 - T_0 = T - t_n$$

* Suppose the trend could be expressed functionally as $A + Bt$. Then the truth of the above statement can readily be seen by considering the random function $X(t) = Z(t) - A - Bt$, and applying the formula

$$E\left[\sum_{i=1}^{n} a_i z_i - \frac{1}{T - T_0}\int_{T_0}^{T} Z(t)\,dt\right]^2 = E\left[\sum_{i=1}^{n} a_i x_i - \frac{1}{T - T_0}\int_{T_0}^{T} X(t)\,dt\right]^2 + B^2\left[\sum_{i=1}^{n} a_i t_i - \frac{1}{T - T_0}\int_{T_0}^{T} t\,dt\right]^2.$$





and (a) $(t_1 - T_0)/(t_2 - t_1)$ is independent of $\lambda$ and $\mu$,

(b) $$\tfrac{1}{2}(t_2 - t_1) \le t_1 - T_0 \le t_2 - t_1, \qquad (4.1)$$

(c) $(t_1 - T_0)/(t_2 - t_1)$ can be determined very closely from the correlation of actual values, distance $(T - T_0)/n$ apart, i.e. from

$$e^{-q(T - T_0)/n}, \qquad (4.2)$$

and is otherwise approximately independent of n.

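Proof of (a). If

$$(T - T_0)q/n = r, \qquad k - \tfrac{1}{2}\mu r/\lambda = K, \qquad (4.3)$$

$$p = P_i/P_{i-1} \quad (i = 2, 3, \dots, n), \qquad p_0 = e^{-q(t_1 - T_0)}. \qquad (4.4)$$

From (3.15)-(3.18)

$$p_0 = K - \tfrac{1}{2}r, \qquad (4.5)$$

$$p = (K - \tfrac{1}{2}r)/(K + \tfrac{1}{2}r), \qquad (4.6)$$

$$p_0^2 p^{n-1} = (K - \tfrac{1}{2}r)^{n+1}/(K + \tfrac{1}{2}r)^{n-1} = e^{-q(T - T_0)} = e^{-nr}. \qquad (4.7)$$

The value of K is given by (4.7) and $(t_1 - T_0)/(t_2 - t_1)$ can be determined from (4.5) and (4.6). They are clearly independent of $\lambda$ and $\mu$.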
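The dependence of the end ratio on n and r alone can be made concrete by solving (4.7) numerically. The sketch below is an illustration added here, not the author's procedure: it writes $p_0 = K - \tfrac{1}{2}r$ and $p = p_0/(p_0 + r)$, which is how (4.5) and (4.6) are read above, and finds $\log p_0$ by bisection. The function name `optimum_spacing` is hypothetical.

```python
import math

def optimum_spacing(n, r):
    """Solve p0^2 * p^(n-1) = exp(-n r), with p = p0/(p0 + r), for p0.

    Returns (p0, p, ratio), where ratio = log(p0)/log(p) = (t1 - T0)/(t2 - t1).
    """
    def g(u):
        # u = log(p0); g(u) = log(p0^2 p^(n-1)) + n r, which increases with u
        return (n + 1) * u - (n - 1) * math.log(math.exp(u) + r) + n * r

    lo, hi = -0.5 * n * r, 0.0     # g(lo) <= 0 <= g(hi) whenever r > 0
    for _ in range(200):           # fixed number of halvings, ample for double precision
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    p0 = math.exp(u)
    p = p0 / (p0 + r)
    return p0, p, u / math.log(p)

if __name__ == "__main__":
    for n, r in ((3, 0.7), (10, 0.7), (3, 3.0), (50, 0.05)):
        p0, p, ratio = optimum_spacing(n, r)
        print(f"n={n:3d} r={r:4.2f}  p0={p0:.4f}  p={p:.4f}  ratio={ratio:.4f}")
```

Neither $\lambda$ nor $\mu$ enters the computation, which is the content of part (a); the printed ratios can also be checked against the bounds in part (b).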


Proof of (b). We have, from (4.6),

$$K - \tfrac{1}{2}r = rp/(1 - p) = p_0. \qquad (4.5')$$

Hence by (4.7)

$$p^{n+1}/(1 - p)^2 = e^{-nr}/r^2. \qquad (4.8)$$

Now

$$p^{-1/2} - p^{1/2} \ge 2\log_e(1/\sqrt{p}) = \log_e(1/p).$$

Hence

$$\frac{p}{(1 - p)^2} \le \frac{1}{(\log_e 1/p)^2}, \qquad \frac{e^{-nr}}{r^2} = \frac{p^{n+1}}{(1 - p)^2} \le \frac{e^{-n\log_e 1/p}}{(\log_e 1/p)^2}.$$

Now $e^{-nr}/r^2$ is a monotone decreasing function of r $(r > 0)$. Hence $\log_e 1/p \le r$, and since $p_0^2 p^{n-1} = e^{-nr}$, it follows that $\log_e 1/p_0 \ge \tfrac{1}{2}r$. Hence

$$\frac{t_1 - T_0}{t_2 - t_1} = \frac{\log p_0}{\log p} \ge \tfrac{1}{2}. \qquad (4.1')$$

Also, since $1 - e^{-rn/(n+1)} < r$, by (4.7),

$$p(1 - p)^{-2/(n+1)} = r^{-2/(n+1)}e^{-rn/(n+1)} \le \frac{e^{-rn/(n+1)}}{(1 - e^{-rn/(n+1)})^{2/(n+1)}}.$$

Now $p(1 - p)^{-2/(n+1)}$ is a monotone increasing function of p $(0 < p < 1)$. Hence $p \le e^{-rn/(n+1)}$. But from (4.7) $p_0 \ge e^{-rn/(n+1)}$. So by (4.4)

$$\frac{t_1 - T_0}{t_2 - t_1} \le 1. \qquad (4.1'')$$

Thus (4.1') and (4.1'') establish (4.1).

Proof of (c). To prove this result, we show that for a given r, $(t_1 - T_0)/(t_2 - t_1)$ is a monotone function of n. For

$$\frac{t_1 - T_0}{t_2 - t_1} = \frac{-\log(K - \tfrac{1}{2}r)}{-\log(K - \tfrac{1}{2}r) + \log(K + \tfrac{1}{2}r)}. \qquad (4.9)$$




It follows from (4.2) that $K + \tfrac{1}{2}r > 1$. Also K is a monotone function of n. Hence, when n increases $(t_1 - T_0)/(t_2 - t_1)$ decreases.

Some values of this ratio have been worked out and are presented in Table 1.* For n = 3 the value is nearly the same as for n large and thus the ratio is nearly independent of n. This completes the proof of theorem 2.


Table 1 (see text)

$T - T_0$ = length of the interval sampled; n = number of sample members. L is the distance such that the correlation of true values, distance L apart, will be 0.5, i.e. is given by $e^{-qL} = 0.5$.

(T - T_0)/(nL)   p = correlation        (t_1 - T_0)/(t_2 - t_1)   (t_1 - T_0)/(t_2 - t_1)
                 between successive     (n large)                 (n = 3)
                 sample members
 0.5             0.707                  0.51                      0.51
 1.0             0.500                  0.53                      0.51
 2.0             0.250                  0.56                      0.54
 3.0             0.125                  0.58                      0.56
 4.0             0.062                  0.61                      0.58
 5.0             0.031                  0.63                      0.60
 7.0             0.008                  0.67                      0.65
10.0             0.001                  0.72                      0.69
15.0             0.00003                0.77                      0.74
20.0             0.000001               0.82                      0.78


We may expand the ratio $(t_1 - T_0)/(t_2 - t_1)$ in terms of r. It will be sufficient to consider the expansion when n is large as, by theorem 2, the result is nearly independent of n. We have

$$\frac{t_1 - T_0}{t_2 - t_1} = \frac{\log\{pr/(1 - p)\}}{\log p} = \frac{-r + \log\{r/(1 - p)\}}{-r} = \frac{1}{2} + \frac{r}{24} + O(r^3). \qquad (4.10)$$

A similar expansion in terms of the parameter $1 - e^{-r}$, which is nearly equal to $1 - p$, is

$$\frac{1}{2} + \frac{1 - e^{-r}}{24} + \frac{(1 - e^{-r})^2}{48} + O\{(1 - e^{-r})^3\}. \qquad (4.11)$$


48 


We cannot expect to have anything but a rough idea of r before an experiment is conducted. 
Fortunately, this is all that will be required. It will be seen from Table 1 that if e - * > 0 25 
(r. >2*8), no accuracy will be lost if the sample members are placed at distances ( T — T 0 )/2n, 
$3(T - T_0)/2n$, etc., from $T_0$. In other words, if the correlation between successive sample members is likely to be appreciable, we divide the whole interval into n equal sections, and place one member at each of the mid-points of the sections. Again, if the number of sample members is large, then the loss of accuracy resulting from placing the samples at distances $(T - T_0)/2n$, $3(T - T_0)/2n$, ... from $T_0$ should be small.


* [Editorial note. Dr Jones appears to have derived the last column of Table 1, first by determining r by successive approximation from p according to the relation $e^{-nr}/r^2 = p^{n+1}/(1 - p)^2$, and then by the use of

$$(t_1 - T_0)/(t_2 - t_1) = \log\{pr/(1 - p)\}/\log p.$$

The approximation of (4.10) is not very useful for small p because r is then large. It may be shown that for large n the ratio $(t_1 - T_0)/(t_2 - t_1)$ tends to $1 - \{\log(1 - p) - \log|\log p|\}/\log p$, which obviates the necessity for determining r.]
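The procedure described in the editorial note can be sketched in a few lines. This is an illustration only, under the reading of the note given above; the function names `ratio_large_n` and `ratio_for_n` are hypothetical, and bisection is used here in place of whatever successive approximation Dr Jones actually employed.

```python
import math

def ratio_large_n(p):
    """Limiting end ratio for large n and fixed p, as quoted in the editorial note."""
    return 1.0 - (math.log(1.0 - p) - math.log(abs(math.log(p)))) / math.log(p)

def ratio_for_n(p, n):
    """Find r from exp(-n r)/r^2 = p^(n+1)/(1-p)^2, then return log{p r/(1-p)}/log p."""
    target = (n + 1) * math.log(p) - 2.0 * math.log(1.0 - p)

    def h(r):
        # log of the left-hand side minus log of the right-hand side; decreasing in r
        return -n * r - 2.0 * math.log(r) - target

    lo, hi = 1e-12, 1.0
    while h(hi) > 0.0:             # widen the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    return math.log(p * r / (1.0 - p)) / math.log(p)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0, 3.0, 5.0, 10.0):   # x = (T - T0)/(nL), so p = 0.5**x
        p = 0.5 ** x
        print(f"{x:5.1f}  p={p:.4f}  n large: {ratio_large_n(p):.2f}  n=3: {ratio_for_n(p, 3):.2f}")
```

Values computed this way need not agree with the printed table to the last digit, since the original entries were obtained by hand.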






CONTINUATION OF DR JONES’S PAPER 


By M. G. KENDALL 


1. In his original version of the foregoing paper Dr Jones included a Part II on the estimation of mean-square error. His treatment was open to misunderstanding and his methods of
estimating constants unnecessarily elaborate. The most unfortunate accident of his death 
prevented a discussion of these matters with him. The editors felt that his work was im- 
portant enough to justify publication and the foregoing paper appears accordingly. In this 
continuation I have re-examined the work in his Second Part and have added some comments 
of my own. It was, the Editors felt, scarcely within their province to substitute this work 
for that of Dr Jones, and to publish it under his name, although some of the methods are due 
to him. 


2. In the first place I wish to comment on Dr Jones’s theorem 2 and its consequences. He 
has shown in his theorem 1 that the optimum distribution of sampling points is obtained 
when they are equidistant, and essentially his second theorem is concerned with the relative 
length of the two end-segments of the range of t as divided by the sampling points. 

Write

$$a = \frac{t_1 - T_0}{t_2 - t_1}. \qquad (1)$$

Then since $p = e^{-q(t_2 - t_1)}$ and $p_0 = e^{-q(t_1 - T_0)}$, $a = \log p_0/\log p$. From Dr Jones’s equation (4.7) we have $p_0^2 p^{n-1} = e^{-nr}$ and thus it follows that

$$a = \frac{1}{2} - \frac{n}{2}\left(1 + \frac{r}{\log p}\right). \qquad (2)$$

Theorem 2 shows that

$$\tfrac{1}{2} \le a \le 1. \qquad (3)$$

Thus, as n becomes larger $1 + r/\log p$ must tend to zero, or p tends to $e^{-r}$. Furthermore, since $r = (T - T_0)q/n$, r must tend to zero (and p accordingly to unity) as n tends to infinity, provided that $(T - T_0)q$ remains constant. Since q is an (unknown) positive constant of the system, this is equivalent to the proviso that $T - T_0$ must remain constant.

3. I think Dr Jones’s Table 1 tends to obscure this point. If we keep the interval $T - T_0$ constant, then as n increases p tends to unity, as we should expect from considerations of continuity. Per contra, for large n the interval increases as p becomes smaller. The increase
in n for fixed p then does not correspond to a denser distribution of sample points but to the 
extension of the sampled strip. In such a case it seems an unnecessary refinement to discuss 
at great length the end intervals when the total interval is tending to infinity. Dr Jones may 
have been thinking here, not of his example of a strip of soil, but of meteorological variation 
which goes on indefinitely. 

4. Let $(T - T_0)q = d$. Since p tends to unity with increasing n for fixed d we may write

$$1 - p = \frac{a_1}{n} + \frac{a_2}{n^2} + \frac{a_3}{n^3} + \frac{a_4}{n^4} + \dots. \qquad (4)$$

On substituting in

$$e^{-nr}(1 - p)^2 = r^2 p^{n+1} \qquad (5)$$

and identifying coefficients we find

$$a_1 = d, \quad a_2 = -\tfrac{1}{2}d^2, \quad a_3 = \frac{(2d + 3)d^3}{12(d + 2)}, \quad a_4 = -\frac{d^5}{24(d + 2)}. \qquad (6)$$


Hence we find

$$1 + \frac{r}{\log p} = -\frac{1}{12}\frac{d^2}{d + 2}\frac{1}{n^2} + O(n^{-3}). \qquad (7)$$

Thus to order $n^{-1}$

$$a = \frac{1}{2} + \frac{1}{24n}\frac{d^2}{d + 2}, \qquad (8)$$

and so, as n tends to infinity a tends to $\tfrac{1}{2}$ in all cases where the interval d is fixed. It appears to me, therefore, that the general rule for large samples is to divide the interval into n equal parts and place one sample member at the middle of each. The discussion for small p is only appropriate when the sample is large and the interval is also large, in which case the location of the first sample member can scarcely be of much practical interest.
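The approach of a to one half for fixed d can be checked numerically. The sketch below is an added illustration, not Kendall's computation: `end_ratio` solves Dr Jones's relation $p_0^2 p^{n-1} = e^{-nr}$ with $p = p_0/(p_0 + r)$, and `end_ratio_first_order` uses the first-order form $\tfrac{1}{2} + d^2/\{24(d + 2)n\}$, which is what expanding (5) in powers of 1/n gives when worked through here. Both function names, and that closed form as printed, should be treated as assumptions of the sketch.

```python
import math

def end_ratio(n, d):
    """Exact a = log(p0)/log(p) for n sample points and d = q(T - T0)."""
    r = d / n

    def g(u):
        # u = log(p0); the root of g gives p0^2 p^(n-1) = exp(-n r)
        return (n + 1) * u - (n - 1) * math.log(math.exp(u) + r) + n * r

    lo, hi = -0.5 * d, 0.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    u = 0.5 * (lo + hi)
    p = math.exp(u) / (math.exp(u) + r)
    return u / math.log(p)

def end_ratio_first_order(n, d):
    """First-order approximation in 1/n for fixed d (assumed form, see lead-in)."""
    return 0.5 + d * d / (24.0 * (d + 2.0) * n)

if __name__ == "__main__":
    d = 2.0
    for n in (3, 10, 50, 200):
        print(n, round(end_ratio(n, d), 6), round(end_ratio_first_order(n, d), 6))
```

For moderate d the two columns agree closely even for small n, which supports the rule of placing one sample member at the middle of each of n equal parts.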


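5. If we keep p fixed and let n tend to infinity, suppose that

$$1 + \frac{r}{\log p} = \frac{b_1}{n} + \frac{b_2}{n^2} + \dots. \qquad (9)$$

Such an expansion is possible because, from (2), $1 + r/\log p$ must tend to zero with increasing n in all cases. On substitution in (5) we find to order $n^{-1}$

$$\frac{p^{1 + b_1}(\log p)^2}{(1 - p)^2}\left(1 - \frac{2b_1}{n}\right) = 1 - \frac{b_2}{n}\log p,$$

whence

$$b_1 = -1 + \frac{2}{\log p}\{\log(1 - p) - \log|\log p|\}, \qquad (10)$$

$$b_2 = 2b_1/\log p. \qquad (11)$$

Thus

$$a = 1 - \frac{1}{\log p}\{\log(1 - p) - \log|\log p|\} + O(n^{-1}). \qquad (12)$$

It happens to be true that as p tends to unity this value of a tends to $\tfrac{1}{2}$.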


6. It may be noted in passing that expansion by these methods (d fixed, n large) leads 
to the series , i 

(,3 > 


which is not the same as Dr Jones’s equation (4*10) unless we let d become large. 

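7. If the trend is constant or linear the mean-square error of our estimate for $\bar{Z}$ is the same as that for $\bar{X}$. We then require

$$E\left(\frac{1}{n}\Sigma x - \bar{X}\right)^2 = \frac{1}{n^2}E(\Sigma x)^2 - \frac{2}{n}E(\Sigma x\,\bar{X}) + E(\bar{X}^2). \qquad (14)$$

Now

$$E(\bar{X}^2) = \frac{1}{(T - T_0)^2}\int_{T_0}^{T}\int_{T_0}^{T} E\{X(t)X(u)\}\,dt\,du = \frac{2\lambda}{(T - T_0)q} - \frac{2\lambda(1 - e^{-q(T - T_0)})}{(T - T_0)^2 q^2}. \qquad (15)$$

Using Dr Jones’s (3.5) we have

$$\frac{2}{n}E(\Sigma x\,\bar{X}) = \frac{2\lambda}{n(T - T_0)q}\sum_{i=1}^{n}(2 - p_0 p^{i-1} - p_0 p^{n-i}) = \frac{4\lambda}{n(T - T_0)q}\left\{n - \frac{p_0(1 - p^n)}{1 - p}\right\} = \frac{4\lambda}{(T - T_0)q}\left\{1 - \frac{p(1 - p^n)}{n^2(1 - p)^2}(T - T_0)q\right\}, \qquad (16)$$

$$E\left(\frac{\Sigma x}{n}\right)^2 = \frac{\mu}{n} + \frac{\lambda}{n^2}\sum_{i}\sum_{j} p^{|i - j|}$$

$$= \frac{\mu}{n} + \frac{\lambda}{n^2}\{1 + p + p^2 + \dots + p^{n-1} + p + 1 + p + \dots + p^{n-2} + \dots + p^{n-1} + p^{n-2} + \dots + 1\}$$

$$= \frac{\mu}{n} + \frac{\lambda(1 + p)}{n(1 - p)} - \frac{2\lambda p(1 - p^n)}{n^2(1 - p)^2}.$$

Hence, on substitution in (14), we find

$$S = \frac{\mu}{n} + \lambda\left\{\frac{1 + p}{n(1 - p)} - \frac{2}{q(T - T_0)}\right\} + 2\lambda\left\{\frac{p(1 - p^n)}{n^2(1 - p)^2} - \frac{1 - e^{-q(T - T_0)}}{(T - T_0)^2 q^2}\right\}. \qquad (17)$$

Since

$$\frac{p^{n+1}}{n^2(1 - p)^2} = \frac{e^{-nr}}{n^2 r^2} = \frac{e^{-q(T - T_0)}}{(T - T_0)^2 q^2}, \qquad (18)$$

this simplifies slightly to

$$S = \frac{\mu}{n} + \lambda\left\{\frac{1 + p}{n(1 - p)} - \frac{2}{q(T - T_0)}\right\} + 2\lambda\left\{\frac{p}{n^2(1 - p)^2} - \frac{1}{q^2(T - T_0)^2}\right\}. \qquad (19)$$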

8. Dr Jones, having reached this expression by a slightly different route, proceeded to argue that the last term was of order $n^{-2}$ and could be neglected, whereas the second term was of order $n^{-1}$. Noting that $q(T - T_0)$ was of order $n\log 1/p$ he was led to the expression, for large n,

$$S = \frac{1}{n}\{\mu + \lambda F(p)\}, \qquad (20)$$

where

$$F(p) = \frac{1 + p}{1 - p} + \frac{2}{\log p}. \qquad (21)$$

He gave the attached Table 2 for $F(p)$ and emphasized that even for low values of p such as 0.01 the mean-square error (apart from the term in $\mu$) was only half of the value for p = 0. He inferred that even a trace of correlation between successive sample members would seriously affect the mean-square error.


9. Now if we hold p constant and let n tend to infinity, formulae (20) and (21) result. But in doing so, as I have already pointed out, we are extending the interval to infinity, not concentrating the sample more densely. In these circumstances it is not so surprising that small correlations should affect the mean-square error substantially. It appears to me to be more relevant to the problem to consider the limiting form of (19) when n tends to infinity but $d = q(T - T_0)$ remains fixed. In such a case we get quite different results.

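10. Consider the first term in braces on the right-hand side of (19).

Substituting from (4) we find

$$\frac{1 + p}{n(1 - p)} - \frac{2}{d} = \frac{1}{n}\left(2 - \frac{a_1}{n} - \frac{a_2}{n^2} - \frac{a_3}{n^3} - \dots\right)\left(\frac{a_1}{n} + \frac{a_2}{n^2} + \frac{a_3}{n^3} + \dots\right)^{-1} - \frac{2}{d},$$

which, to order $n^{-2}$, reduces to

$$\frac{d(d + 3)}{6(d + 2)n^2}, \qquad (22)$$

which is of order $n^{-2}$, not $n^{-1}$.

Similarly for the last expression in (19)

$$\frac{p}{n^2(1 - p)^2} - \frac{1}{d^2} = -\frac{d}{12(d + 2)n^2}. \qquad (23)$$

To order $n^{-1}$ we then have merely

$$S = \frac{\mu}{n}, \qquad (24)$$

or, to order $n^{-2}$,

$$S = \frac{\mu}{n} + \frac{\lambda d}{6n^2}. \qquad (25)$$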

Table 2 (see text)

   p          F(p)   |    p       F(p)   |    p       F(p)
 0.0          1.000  |  0.06      0.417  |  0.35      0.172
 0.000001     0.855  |  0.07      0.398  |  0.40      0.151
 0.001        0.712  |  0.08      0.382  |  0.45      0.137
 0.005        0.633  |  0.09      0.367  |  0.50      0.115
 0.01         0.586  |  0.10      0.354  |  0.60      0.085
 0.02         0.530  |  0.15      0.299  |  0.70      0.059
 0.03         0.491  |  0.20      0.257  |  0.80      0.037
 0.04         0.462  |  0.25      0.224  |  0.90      0.018
 0.05         0.438  |  0.30      0.196  |  1.00      0.000
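The entries of Table 2 follow from (21) directly. A minimal Python sketch, added here for illustration (the function name `f_of_p` is an assumption), is:

```python
import math

def f_of_p(p):
    """F(p) = (1 + p)/(1 - p) + 2/log p for 0 < p < 1, with limits 1 at p = 0 and 0 at p = 1."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    return (1.0 + p) / (1.0 - p) + 2.0 / math.log(p)

if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1, 0.5, 0.9):
        print(f"p = {p:<6} F(p) = {f_of_p(p):.3f}")
```

The slow decay of $2/\log p$ as p falls towards zero is what makes F(p) drop so quickly from 1, the feature of the table that Dr Jones emphasized.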


11. The appearance of the term in $\mu$ is to be expected; it represents the variance of superposed error of observation. But the result that the remaining part of the mean-square error is of order $n^{-2}$ is at first sight surprising. One is so accustomed in statistical work to a sampling variance of order $n^{-1}$ that anything of lower order requires some explanation.

It is here that we must remember that our problem is not the determination of the mean-square error or the sampling variance of n independent observations. On the contrary, under our assumptions, the correlation between neighbouring sample members tends to unity with increasing n. The variation of such members among themselves is of order $n^{-2}$ as may be verified from (17), neglecting the term in $\mu$. Since we choose our function $\bar{X}$ so as to fit these observations as closely as possible it is not, after all, surprising that the average difference square of $\frac{1}{n}\Sigma(x)$ and $\bar{X}$ is also of order $n^{-2}$.


12. There is one essential discontinuity in the situation, however, which is worth noticing. If $q(T - T_0)$ remains finite then if n tends to infinity, p must tend to unity and the series (apart from error of observation) is continuous. However large q may be this is true. But if q is itself infinite the series is not necessarily continuous, the observations are independent and we revert to the usual situation of n independent observations. Thus for any q, however large, (25) is correct; but for q infinite it is not. Looking back to (19) we see that (since p in this case is zero)

$$S = \frac{\mu}{n} + \frac{\lambda}{n},$$

which is what we should expect.


13. From (19) we see that (without approximation) the mean-square error S depends on $\mu$, $\lambda$ and p; on the known quantities $T - T_0$ and n; and on q, which is known when p is known and the interval between successive observations is known. In practice we may require to estimate $\mu$, $\lambda$ and p. In the limiting case which Dr Jones considered, leading to (20) we also require $\mu$, $\lambda$ and p. In the limiting case which I am considering, leading to (25), we require only $\mu$ and $\lambda$.

Dr Jones proceeded by dismissing $\mu$ and considering the estimation of $\lambda$ and p from the expectations of powers of differences $\Sigma(x_i - x_j)^2$. This is really equivalent to using the serial correlations of the observations.

We have, since $E(x) = 0$,

$$E(x^2) = \mu + \lambda, \quad E(x_j x_{j+1}) = \lambda p, \quad E(x_j x_{j+2}) = \lambda p^2, \quad \text{etc.}$$

If we take the observed serial covariances of the observations (freed from trend) as estimators of the corresponding expectations we then have, if the serial correlations are $r_1$, $r_2$, etc.

$$p = \frac{r_2}{r_1}, \qquad (26)$$

$$\lambda = \frac{r_1^2}{r_2}\,\mathrm{var}\,x, \qquad (27)$$

$$\mu = \mathrm{var}\,x\left(1 - \frac{r_1^2}{r_2}\right). \qquad (28)$$

These seem to me to be the simplest equations of estimation which one is likely to find. If we may assume that $\mu = 0$ we have the more reliable forms

$$p = r_1, \qquad (29)$$

$$\lambda = \mathrm{var}\,x. \qquad (30)$$

These equations are the usual ones for estimating constants in a Markoff series.
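The estimating equations (26)-(28) translate directly into code. The sketch below is an illustration under stated assumptions rather than a prescribed procedure: the function names are hypothetical, the series is taken to be already freed from trend, and the serial correlation helper subtracts the sample mean and divides by the full length n, which is one common convention among several.

```python
def serial_correlation(x, k):
    """Lag-k serial correlation of a series x (mean removed, divisor n: an assumed convention)."""
    n = len(x)
    m = sum(x) / n
    c0 = sum((xi - m) ** 2 for xi in x) / n
    ck = sum((x[i] - m) * (x[i + k] - m) for i in range(n - k)) / n
    return ck / c0

def estimate_parameters(var_x, r1, r2):
    """Estimates of p, lambda and mu from var x and the first two serial correlations."""
    p = r2 / r1                      # (26)
    lam = (r1 * r1 / r2) * var_x     # (27)
    mu = var_x - lam                 # (28), since E(x^2) = mu + lambda
    return p, lam, mu

if __name__ == "__main__":
    # Population values implied by lambda = 2, mu = 0.5, p = 0.6
    var_x = 2.0 + 0.5
    r1 = 2.0 * 0.6 / var_x
    r2 = 2.0 * 0.6 ** 2 / var_x
    print(estimate_parameters(var_x, r1, r2))
```

When the population moments are supplied the function returns the parameters exactly; with sample serial correlations the ratio $r_2/r_1$ can be unstable if $r_1$ is small, which is presumably why the forms (29) and (30) are described as more reliable when $\mu = 0$ may be assumed.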

14. One final comment. The relative simplicity of Dr Jones’s result that the optimum distribution of sample points is equidistantly along the interval is a little deceptive. It is natural to consider the more general problem when the autocorrelations along the series are not necessarily decaying according to an exponential law, but have any form permissible for a continuous random process. An intuitive approach might suggest that the weights and distances of the observations should be equal because there is no obvious reason to the contrary; but any conclusion of this kind would be quite wrong. If we denote the autocorrelation function of the series by $\rho(t)$ and the sample points are $t_1 \dots t_n$ in that order the general equations corresponding to Dr Jones’s (3.11) and (3.12) are

$$v_1\left(1 + \frac{\mu}{\lambda}\right) + v_2\rho(t_2 - t_1) + \dots + v_n\rho(t_n - t_1) = E\{x(t_1)\bar{X}\},$$

$$v_1\rho(t_2 - t_1) + v_2\left(1 + \frac{\mu}{\lambda}\right) + \dots + v_n\rho(t_n - t_2) = E\{x(t_2)\bar{X}\},$$

$$\dots$$

$$v_1\rho(t_n - t_1) + v_2\rho(t_n - t_2) + \dots + v_n\left(1 + \frac{\mu}{\lambda}\right) = E\{x(t_n)\bar{X}\}, \qquad (31)$$

$$0 + v_2\rho'(t_2 - t_1) + \dots + v_n\rho'(t_n - t_1) = E'\{x(t_1)\bar{X}\},$$

$$-v_1\rho'(t_2 - t_1) + 0 + \dots + v_n\rho'(t_n - t_2) = E'\{x(t_2)\bar{X}\},$$

$$\dots$$

$$-v_1\rho'(t_n - t_1) - v_2\rho'(t_n - t_2) - \dots + 0 = E'\{x(t_n)\bar{X}\}, \qquad (32)$$


where the primes denote differentiation. I cannot see any general tractable solution to these 
equations. Consideration of some particular cases suggests that general solutions would be 
rather involved. For instance, if the autocorrelation function is a sine curve (corresponding 
to sinusoidal periodicity in the original series) a set of equidistant observations might be 
the worst possible if the distance between them was equal to the period of the system. Again, 
if the autocorrelation function decays to zero in distance l and the observations are so sparse 
as to be farther than l apart they are independent and their position is indeterminate within 
limits. Further research on this subject is needed. 





THE PRECISION OF OBSERVED VALUES 
OF SMALL FREQUENCIES 

By J. B. S. HALDANE, F.R.S. 

In recent genetical work numerous observers have recorded the frequencies of rare events, 
notably mutations. It has been realized that it is misleading to state the observed frequencies 
with their standard errors, since the distribution is decidedly skew. Various devices have been 
suggested to avoid this difficulty. But so far as I know it has not been pointed out that, when 
the frequency is small, its cube root is almost normally distributed. This will be proved and 
applied to actual observations. 

Let a rare event be observed in a out of n trials, where n is much greater than $a^2$. Let x be the true value of the frequency, whose observed value is $p = a/n$. Let the a priori distribution of x be

$$dF = \phi(x)\,dx.$$

Let the probability distribution, after the observation has been made, be

$$dF = f(x)\,dx,$$

and let $x = y^3$.

Then for given values of n and x, the probability of a is

$$\binom{n}{a}x^a(1 - x)^{n-a}.$$

Hence for given values of n and a, the distribution of x is

$$dF = f(x)\,dx = \frac{x^a(1 - x)^{n-a}\phi(x)\,dx}{\int_0^1 x^a(1 - x)^{n-a}\phi(x)\,dx}.$$

If we assume that all values of x are equiprobable, $\phi(x) = 1$, and

$$\bar{x} = \frac{(n + 1)!}{a!\,(n - a)!}\int_0^1 x^{a+1}(1 - x)^{n-a}\,dx = \frac{a + 1}{n + 2}.$$

This value should of course be $a/n$. As I have previously remarked (Haldane, 1932) and as Jeffreys (1948) has shown in greater detail, the assumption that $\phi(x) = 1$ introduces a bias. It is also contrary to common sense. If we are trying to estimate a mutation rate, we know a priori that it will almost certainly be less than $10^{-3}$ and greater than $10^{-20}$. In a particular case we might perhaps guess that such a rate would be about as likely to lie between $10^{-5}$ and $10^{-6}$ as between $10^{-6}$ and $10^{-7}$. In other words, when x is small it is more nearly true that all values of log x are equiprobable than that all values of x are equiprobable. This would imply that $\phi(x) = c/x$ in the region considered. However, this cannot continue to be true when x is sufficiently small. If we wished to state a plausible general form for the a priori distribution of x it might be somewhat as follows:

$$F = k \ (x = 0), \qquad dF = \frac{C\,dx}{(x + \epsilon)(1 + \epsilon - x)} \ (0 < x < 1), \qquad F = k \ (x = 1),$$

where k is some number less than $\tfrac{1}{2}$ expressing the possibility that x may prove to be zero or unity,

$$C = \frac{(1 - 2k)(1 + 2\epsilon)}{2\log(1 + \epsilon^{-1})},$$

and $\epsilon$ is a very small number, perhaps of the order of $10^{-100}$, expressing the fact that exceedingly rare events are relatively infrequent. If the universe is finite in space and in time, and if there is a minimum time in which an event can occur, it might imply that there is no sense in discussing events which have no appreciable probability of ever occurring.

For practical purposes, however, so long as we know that a exceeds zero, and is less than n, that is to say, that the event considered is possible and so is its converse, we can take $\phi(x) = \dfrac{C}{x(1 - x)}$ without appreciable error. We then have

$$dF = \frac{(n - 1)!}{(a - 1)!\,(n - a - 1)!}\,x^{a-1}(1 - x)^{n-a-1}\,dx. \qquad (1)$$

This is a Pearsonian Type I distribution, and

$$\overline{x^r} = \frac{(n - 1)!\,(a + r - 1)!}{(n + r - 1)!\,(a - 1)!}.$$

Thus $\bar{x} = a/n$, as it should be, $\overline{x^2} = \dfrac{a(a + 1)}{n(n + 1)}$, etc. When $an^{-1}$ is small, this approximates very closely to the Type III distribution

$$dF = \frac{n^a e^{-nx}x^{a-1}}{(a - 1)!}\,dx. \qquad (2)$$

Now Wilson & Hilferty (1931) showed that the cube root of $\chi^2$ is almost normally distributed; and the same transformation will almost normalize many Type III distributions. The standard form of this type, referred to its mode, is

$$dF = y_0\left(1 + \frac{x}{a}\right)^{\gamma a}e^{-\gamma x}\,dx.$$

It is more convenient to change the origin to the point where the probability becomes zero, and write

$$dF = \frac{\gamma^c x^{c-1}\,dx}{\Gamma(c)\,e^{\gamma x}},$$

where $c = 1 + p = 1 + \gamma a = 4/\beta_1$.

$\kappa_r = (r - 1)!\,c\gamma^{-r}$, so the mean is $c\gamma^{-1}$ and the moments about it are

$\mu_2 = c\gamma^{-2}$, $\mu_4 = (3c^2 + 6c)\gamma^{-4}$, $\mu_6 = (15c^3 + 130c^2 + 120c)\gamma^{-6}$, $\mu_8 = (105c^4 + \dots)\gamma^{-8}$,

$\mu_3 = 2c\gamma^{-3}$, $\mu_5 = (20c^2 + 24c)\gamma^{-5}$, $\mu_7 = (210c^3 + 924c^2 + 720c)\gamma^{-7}$, $\mu_9 = (2520c^4 + \dots)\gamma^{-9}$.

Let $x = c\gamma^{-1} + z$, so that $\overline{z^r} = \mu_r$ and $y = (\gamma x/c)^{1/3}$. Then

$$y^r = \left(1 + \frac{\gamma z}{c}\right)^{r/3},$$

and

$$\overline{y^r} = 1 + \frac{r(r - 3)}{18c} + \frac{r(r - 1)(r - 3)(r - 6)}{2(18c)^2} + \frac{r^2(r - 3)^2(r - 6)(r - 9)}{6(18c)^3} + \frac{r(r - 3)(r - 6)(r - 9)(r - 12)(5r^3 - 30r^2 + 15r + 18)}{120(18c)^4} + O(c^{-5}).$$

299. 


(3) 

" r ,r -(S)i[ 1 -0(Sji-(4 + O H' 

’■■-(4[ i+ s +o H- 

r »*^[ I + S +0(c ~ !) ]' 

r *"(4 i + 0<c '' ) ' 

KK 

7i= _ c . + 0 (c-). 

Thus provided 9c, or $9(1 + p)$, is large, the approximation to normality, up to the sixth moment, is satisfactory. But it is of no value when p is negative, that is to say, the curve is J-shaped. (Here p is of course the parameter used in specifying Type III distributions, and not the observed frequency value.)

To apply these formulae to the distribution of y, we have only to put $a = c$, and to multiply $\kappa_r$ by $p^{r/3}$. We thus find

A* =• ^(a- 4 )], v 4 = p‘[rls«^ 3 + <,> (« _4 )]. etc. J ; 

Thus <r = p*/3a*, y, = 2 ya *, y 2 = §a -1 , all approximately. 

The terms involving $a^{-3}$ in the mean and standard error may be safely neglected in practice. Even when a = 1, the former is only 0.013 of the standard error. If we take (1) as our distribution of x, a term of order $n^{-1}$ must be added to those of order $a^{-3}$. This also can be safely neglected.

Thus we find that y is almost normally distributed with mean $p^{1/3}\left(1 - \dfrac{1}{9a}\right)$ and standard deviation $p^{1/3}/3a^{1/2}$. For example, if n = 1000, a = 8, p = 0.008, x is by no means normally distributed about 0.008, for $\beta_1 = 0.5$ and $\beta_2 = 3.75$. But y is very nearly normally distributed


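Or, putting $t = \dfrac{1}{9c}$,

$$\bar{y} = 1 - t + \tfrac{10}{3}t^3 + \tfrac{11}{3}t^4 + O(t^5),$$

$$\overline{y^2} = 1 - t + t^2 + \tfrac{7}{3}t^3 - \tfrac{28}{3}t^4 + O(t^5),$$

$$\overline{y^3} = 1,$$

$$\overline{y^4} = 1 + 2t - 3t^2 + \tfrac{10}{3}t^3 + \tfrac{41}{3}t^4 + O(t^5),$$

$$\overline{y^5} = 1 + 5t - 5t^2 + \tfrac{25}{3}t^3 + \tfrac{14}{3}t^4 + O(t^5),$$

$$\overline{y^6} = 1 + 9t.$$

Hence the cumulants of the distribution of y are

$$\kappa_1 = 1 - t + \tfrac{10}{3}t^3 + \tfrac{11}{3}t^4 + O(t^5),$$

$$\kappa_2 = t - \tfrac{13}{3}t^3 - 10t^4 + O(t^5),$$

$$\kappa_3 = 4t^3 + 16t^4 + O(t^5),$$

$$\kappa_4 = -2t^3 - 16t^4 + O(t^5),$$

$$\kappa_5 = 8t^4 + O(t^5),$$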


Ko = 


-ss^+o^ 5 ),) 



300 The precision of observed values of small frequencies 

about 0.2 × (1 − 1/72), or 0.1972, with standard deviation 0.0083, with $\beta_1 = 0.00004$, and
$\beta_2 = 3.028$. The method of Haldane (1938) would give an even better fit if a > 10.

Two examples will be given showing how the method can be actually used. 

Muller (1928, p. 311) found 13 lethal genes in 1034 X-chromosomes of flies kept at 27° C.,
and 5 in 840 X-chromosomes of flies kept at 19.6° C. Thus corrected values of $y_1$ and $y_2$ are:

$y'_1 = 0.23054 \pm 0.02150$, $y'_2 = 0.17720 \pm 0.02702$.

$y'_1 - y'_2 = 0.05334$, which is 1.55 times its standard error of 0.03452. The difference is
therefore rather more significant than Muller, who used the usual formula, believed.
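The arithmetic of this comparison can be sketched in a few lines. The sketch assumes the approximate mean $p^{1/3}\{1 - 1/(9a)\}$ and standard deviation $p^{1/3}/(3a^{1/2})$ for the cube root, with p = a/n taken as the observed frequency; the function name is illustrative only.

```python
import math

def cube_root_estimate(a, n):
    """Corrected cube-root value and its standard error for a events in n trials.

    Assumes mean p^(1/3) * (1 - 1/(9a)) and s.d. p^(1/3) / (3 * sqrt(a)), p = a/n.
    """
    root = (a / n) ** (1.0 / 3.0)
    return root * (1.0 - 1.0 / (9.0 * a)), root / (3.0 * math.sqrt(a))

# Muller's lethal-gene counts at the two temperatures.
y1, se1 = cube_root_estimate(13, 1034)
y2, se2 = cube_root_estimate(5, 840)

diff = y1 - y2
se_diff = math.hypot(se1, se2)   # standard error of the difference
ratio = diff / se_diff

print(round(y1, 5), round(se1, 5))
print(round(y2, 5), round(se2, 5))
print(round(diff, 5), round(se_diff, 5), round(ratio, 2))
```

The two estimates are treated as independent, so their variances add in the standard error of the difference.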

Again Muller (1940) obtained 7 translocations in 3366 flies with a dose of 375 r., and 56 in
2223 flies with a dose of 1500 r. The question at issue was as follows: 'the frequency may be
proportional to the dosage, to its 3/2 power or to its square. With which, if any, of these
hypotheses are the observed results consistent?'

$y'_1 = 0.125616 \pm 0.016081$, $y'_2 = 0.29256 \pm 0.01306$.

We therefore compare $y'_2$ with

$2^{2/3}y'_1 = 0.19940 \pm 0.02559$, $2y'_1 = 0.25123 \pm 0.03216$, $2^{4/3}y'_1 = 0.31653 \pm 0.04052$.

The differences are respectively 3.25, 1.19 and 0.56 times their standard errors, so either
of the latter two hypotheses is admissible.

It is perhaps worth remarking that, if the emendation of the classical inverse probability
distribution be rejected, and the calculation made according to Bayes's hypothesis,
the cube root of the frequency is still almost normally distributed. It is also true that if the
frequency of a rare event is estimated by the method described by Haldane (1945) when the
observations cease when a fixed number m of rare events have occurred, the estimated
frequency being (m − 1)/(n − 1), where n is the total number of observations, the cube root
of the estimate is almost normally distributed. Here too the cube root may be used with
advantage in comparing different estimates.

I have to thank Prof. E. S. Pearson for valuable criticism. 

Summary 

When an event is rare, the distribution of the cube root of the frequency round the cube 
root of the estimate is much more nearly normal than the distribution of the true frequency 
round the estimate. 


REFERENCES 

Haldane, J. B. S. (1932). A note on inverse probability. Proc. Camb. Phil. Soc. 28, 55-61.
Haldane, J. B. S. (1938). The approximate normalization of a class of frequency distributions. Biometrika, 29, 392-404.
Haldane, J. B. S. (1945). On a method of estimating frequencies. Biometrika, 33, 222-5.
Jeffreys, H. (1948). Theory of Probability. Oxford University Press.
Muller, H. J. (1928). The measurement of gene mutation rate in Drosophila, its high variability and its dependence on temperature. Genetics, 13, 274-367.
Muller, H. J. (1940). An analysis of the process of structural change in chromosomes of Drosophila. J. Genet. 40, 1-66.
Wilson, E. B. & Hilferty, M. M. (1931). The distribution of Chi-square. Proc. Nat. Acad. Sci., Wash., 17, 684-88.





NOTE ON PROFESSOR HALDANE’S PAPER REGARDING 
THE TREATMENT OF RARE EVENTS 


By E. S. PEARSON 


In the preceding paper Prof. Haldane has suggested a method of handling certain problems
involving the occurrence of rare events, by introducing the cube-root transformation. His
method of attack involves the use of the concept of inverse probability, so that his final,
closely normal distribution is the posterior probability distribution of y, the cube root of
the unknown probability x of the occurrence of the event. From this point of view x, and
therefore y, are continuous variables. If we attack the problem without an appeal to inverse
probability, we are concerned with the probability distribution of a, the observed sample
frequency, for a given x. Since a is a discontinuous random variable which, in the problems
considered, may only assume the first few integer values 0, 1, 2, ..., the cube-root
transformation would here clearly introduce some awkward problems of discontinuity. It seems
of interest to consider how the examples which Haldane gives could be dealt with by the
direct method without the need of substituting approximate standard errors or, indeed,
of assuming that a very skew distribution is normal.

In both Haldane's examples we are concerned with a comparison of the results of two
experiments. In a first sample of size $n_1$ an event occurs on $a_1$ occasions, the chance of
occurrence being $x_1$; similarly for the second sample we have $a_2$, $n_2$ and $x_2$. We then ask whether
the results are consistent with the hypothesis that $x_1 = \kappa x_2$, where, for the first example
$\kappa = 1$ and for the second has to be taken successively as $\tfrac14$, $(\tfrac14)^{3/2}$ and $(\tfrac14)^2$.

The assumption which I shall make is that in the problems considered x and 1/n are
sufficiently small to justify the use of the Poisson series in place of the binomial expansion
$(1 - x + x)^n$. If this is the case and we write

$$m_i = n_i x_i \quad (i = 1, 2),$$

the chance of the observed event may be written

$$p(a_1, a_2 \mid m_1, m_2) = e^{-m_1}\frac{m_1^{a_1}}{a_1!}\, e^{-m_2}\frac{m_2^{a_2}}{a_2!} = e^{-\mu}\frac{\mu^{a_1+a_2}}{(a_1+a_2)!} \times \frac{(a_1+a_2)!}{a_1!\,a_2!}\,\lambda^{a_1}(1-\lambda)^{a_2}, \qquad (1)$$

where

$$\mu = m_1 + m_2, \qquad \lambda = \frac{m_1}{m_1 + m_2}. \qquad (2)$$

For a fixed value of $r = a_1 + a_2$, the relative distribution of $a_1$ follows a binomial, $(1 - \lambda + \lambda)^r$,
where $\lambda$ is specified by hypothesis. Thus if the hypothesis is that $m_1 = m_2$, then $\lambda = \tfrac12$,
but in general

$$\lambda = \frac{n_1 x_1}{n_1 x_1 + n_2 x_2} = \frac{n_1 \kappa}{n_1 \kappa + n_2}, \qquad (3)$$

where $\kappa$ is the ratio of the true or hypothetical chances, $x_1/x_2$.

As in the case of the more general problem of the analysis of 2 x 2 tables, where the con- 
ditional distribution is a hypergeometric series, there may be some difference of opinion on 
the way in which the result is used. 

(1) We may consider that the whole answer lies in the conditional distribution of $a_1$ (or $a_2$)
for fixed r, in the sense that we need only ask whether the partition of the observed events
into $a_1$ occurring in the first sample and $a_2$ in the second is consistent with our hypothesis
as to $\kappa$. The answer is obtained in terms of probability by summing the tail terms of the
binomial $(1 - \lambda + \lambda)^r$. This approach appears a natural one to take when two treatments
have been randomly assigned among $n_1 + n_2$ individuals.

(2) We may wish to set the observed result against the two-dimensioned distribution of $a_1$
and $a_2$ obtainable in unrestricted sampling, i.e. without the condition that $r = a_1 + a_2$ is
fixed. This distribution depends on the value of $\mu = m_1 + m_2$ which is not specified by the
hypothesis, but it is possible to obtain upper limits to significance levels; this problem was
considered by Przyborowski & Wilenski (1939) for the case $\lambda = \tfrac12$ and further tables for
some other special cases were circulated during the war within the Ministry of Supply
(Barnard, 1944; Allinson, 1944).

(3) Still avoiding the condition that r is fixed, we may note that for unrestricted sampling

$$u = \frac{a_1 - r\lambda}{\sqrt{\{r\lambda(1-\lambda)\}}} \qquad (4)$$

is a random variable which, on the hypothesis tested, has a zero expectation and unit
variance. Its sampling distribution is composite, being the sum of a number of binomials
combined with weights depending on the unknown value of $\mu = m_1 + m_2$. However, if $\lambda$
is not too different from $\tfrac12$, this distribution will not be very far from the normal even when
dealing with small frequencies. A calculation of the ratio u will therefore often provide the
broad answer needed in practice.

These points are illustrated on Haldane’s examples. 

Example 1. $a_1 = 13$, $n_1 = 1034$; $a_2 = 5$, $n_2 = 840$; r = 18. In this problem, the hypothesis
tested is that $x_1 = x_2$, so that $\kappa = 1$ and $\lambda = n_1/(n_1 + n_2) = 0.5518$. On the assumption that
the flies were randomly divided into the two temperature groups, the null hypothesis is
that 18 out of the 1874 would have produced progeny of a type which showed they carried
a lethal gene at whichever temperature they were kept. Using the binomial approximation
to the hypergeometric, $a_1$ would assume values of 0, 1, 2, ..., 18 with probabilities
given by the expansion of $(0.4482 + 0.5518)^{18}$. The chance that $a_1 \ge 13$ is given by the
Incomplete Beta Function Ratio

$$I_\lambda(a_1, r - a_1 + 1) = I_{0.5518}(13, 6) = 0.1107.$$

An approximation to this chance can be obtained from the integral under the normal curve
having Mean $= r\lambda = 9.932$, s.d. $= \sqrt{\{r\lambda(1-\lambda)\}} = 2.110$,

using the correction for continuity. We then get the ratio (12.5 − 9.932)/2.110 = 1.217,
corresponding to a chance of 0.1118.
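The conditional test in this example can be reproduced by summing the binomial tail directly, which is equivalent to looking up the Incomplete Beta Function Ratio. The following is a minimal sketch; the function name is illustrative, and the normal tail is evaluated with the standard complementary error function rather than from tables.

```python
import math

def binomial_upper_tail(r, lam, a):
    """P(a_1 >= a) when a_1 follows the binomial (1 - lam + lam)^r."""
    return sum(math.comb(r, j) * lam**j * (1.0 - lam)**(r - j)
               for j in range(a, r + 1))

a1, n1, a2, n2 = 13, 1034, 5, 840
r = a1 + a2
lam = n1 / (n1 + n2)              # kappa = 1, so lambda = n1 / (n1 + n2)

tail = binomial_upper_tail(r, lam, a1)
mean = r * lam
sd = math.sqrt(r * lam * (1.0 - lam))

z_cc = (a1 - 0.5 - mean) / sd     # with the correction for continuity
normal_tail = 0.5 * math.erfc(z_cc / math.sqrt(2.0))
u = (a1 - mean) / sd              # unrestricted-sampling ratio, no correction

print(round(lam, 4), round(tail, 4))
print(round(mean, 3), round(sd, 3), round(z_cc, 3), round(normal_tail, 4))
print(round(u, 2))
```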

If we take the approach of (3) above, avoiding restriction to the conditional set r = 18,
then a correction for continuity is not appropriate and we find

$$u = (13 - 9.932)/2.110 = 1.45,$$

a ratio to be compared with Haldane's 1.55, obtained from the posterior distribution of $x_1^{1/3} - x_2^{1/3}$.

Example 2. Here $a_1 = 7$, $n_1 = 3366$; $a_2 = 56$, $n_2 = 2223$; r = 63. Three hypotheses are
examined, (a), (b) and (c), and as two of these involve the assumption that the chance x of
translocation varies with dose, the randomization approach to the conditional distribution is
less clear.* I think I should here base my conclusions on the values of the ratio u. Relevant
figures are given below, the means and standard deviations being for the binomials $(1 - \lambda + \lambda)^{63}$.
It will be seen that the values of $(a_1 - r\lambda)/\sqrt{\{r\lambda(1-\lambda)\}}$ are not very different from Haldane's
ratios, and that the same conclusions would be reached using either method of approach.

* In this case, with $a_2 = 56$, we are beyond the range of the Tables of the Incomplete Beta Function.


Hypothesis                      Mean a_1|r   S.D. of a_1|r   u = (a_1 − mean)/S.D.   Haldane's ratio

(a) κ = 1/4,  λ = 0.2746          17.300        3.542              −2.91                 −3.24
(b) κ = 1/8,  λ = 0.1591          10.026        2.904              −1.04                 −1.19
(c) κ = 1/16, λ = 0.0865           5.447        2.231               0.70                  0.56
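The entries for the three hypotheses follow from equation (3) for $\lambda$ and the binomial mean and standard deviation for fixed r. A short sketch of the calculation is given here, assuming $\kappa$ takes the values 1/4, 1/8 and 1/16 for hypotheses (a), (b) and (c); the variable names are illustrative.

```python
import math

a1, n1, a2, n2 = 7, 3366, 56, 2223
r = a1 + a2

rows = {}
for label, kappa in (("a", 1 / 4), ("b", 1 / 8), ("c", 1 / 16)):
    lam = n1 * kappa / (n1 * kappa + n2)      # equation (3)
    mean = r * lam                            # binomial mean of a_1 for fixed r
    sd = math.sqrt(r * lam * (1.0 - lam))     # binomial s.d. of a_1 for fixed r
    u = (a1 - mean) / sd                      # equation (4)
    rows[label] = (lam, mean, sd, u)
    print(label, round(lam, 4), round(mean, 3), round(sd, 3), round(u, 2))
```

Only hypothesis (a) gives a value of u that is large in absolute magnitude, which is the same conclusion as that drawn from the cube-root comparison.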


To sum up:

(1) if conditions justify us in using the Poisson series to represent the binomial
distribution of a, given x, which will generally be the case with 'rare' events,

(2) if we are content to base our answer on the conditional distribution of $a_1$ for fixed
$r = a_1 + a_2$,

the direct method of attack in this two-sample problem provides a test involving a binomial
distribution. The exact answer in terms of a significance level requires the calculation of the
sum of tail terms of the binomial, which can be obtained directly from the Tables of the
Incomplete Beta Function if both $a_1$ and $a_2 + 1 \le 50$. Alternatively, if we do not wish to
restrict variation to the conditional set, we may obtain a less precise answer by referring
$u = (a_1 - r\lambda)/\sqrt{\{r\lambda(1-\lambda)\}}$ to the normal integral, a procedure which will be satisfactory
if $\lambda$ is not too far from 0.5.

Haldane’s solution with its continuous normal distribution is obviously attractive, but 
it has involved the introduction of the theory of inverse probability. 


REFERENCES 

Allinson, V. A. (1944). Technical Report Q.C./R/21 (Ministry of Supply).
Barnard, G. A. (1944). Technical Report Q.C./R/18 (Ministry of Supply).
Haldane, J. B. S. (1948). Biometrika, 35, 297.
Przyborowski, J. & Wilenski, H. (1939). Biometrika, 31, 313.





A FURTHER NOTE ON THE MEAN DEVIATION 


By H. J. GODWIN, University College of Swansea 


1. Introduction. In a previous paper (1945), I obtained the distribution of the estimate of
mean deviation obtained from samples from a normal population; the method of derivation 
was an algebraic transformation of the sample space. In the present note I explain the 
geometrical significance of the result, and obtain a method of computing the moments of the 
distribution. Finally, I discuss various approximations to the distribution. 


2. Geometrical prologue. The fundamental geometrical entity which will occur here is the
regular simplex in k dimensions — a figure whose vertices are (k+ 1) points each equidistant 
from the remainder. (Particular cases are the equilateral triangle and the regular tetra- 
hedron.) These points lie on a hypersphere whose centre is the centroid of the simplex. Any 
k of the points form a regular simplex of (k— 1) dimensions; the join of the centroid of this 
to the centroid of the whole passes through the remaining vertex and is perpendicular to the 
space of the (k − 1)-dimensional simplex. Since the centroid divides the perpendicular from
a vertex to the opposite face in the ratio k : 1, the angle made with each other by the (equally
inclined) lines from the centroid to the vertices is arc cos(−1/k). If the length of side of the
simplex is a, then each vertex is at a distance $a\sqrt{k/\{2(k+1)\}}$ from the centroid and, by a
suitable choice of axes through the centroid, the vertices are


^J{2k(k + 1)}’ <J{2(k-l)k}' 


V{2(£w+l)(£-r+2)}’ 
k-r 


a J(w^TT))’ °'° 0 *>• (l) 


[2(k — r + 1 ) 

The vertices lie in sets of k on the {k + 1 ) bounding hyperplanes 

2 


h 


Jiiclk+l))*' J[kk- 1 ))* ! - J{ (k+i-r)W+2-r)) 

T-° *>• < 2 ’ 


The defining relations for the interior of the simplex are that the left-hand sides of (2) should 
be not less than zero. 

We now find the value of the integral of $\exp\{-\tfrac12(x_1^2 + x_2^2 + \ldots + x_k^2)\}$ taken through the
interior of the simplex. Let this be E(k, a). By joining the centroid of the simplex to the
edges of any one of the (k − 1)-dimensional simplexes in the bounding hyperplanes we
obtain (k + 1) equal regions and the integral of $\exp\{-\tfrac12(x_1^2 + x_2^2 + \ldots + x_k^2)\}$ through such a
region is E(k, a)/(k + 1). A section of a region by a hyperplane parallel to the base hyperplane
and at a perpendicular distance x from the centroid is a simplex of side $x\sqrt{\{2k(k+1)\}}$. Hence

$$\frac{E(k, a)}{k+1} = \int_0^{a/\sqrt{\{2k(k+1)\}}} E[k-1,\ x\sqrt{\{2k(k+1)\}}]\; e^{-\frac12 x^2}\,dx.$$

It follows, by induction, that $E(k, a) = \sqrt{(k+1)}\;G_k(a/\sqrt2)$, (3)

where the function $G_k$ is as defined in my 1945 paper. [This is, perhaps, the appropriate place
in which to remark that this function is related to a function defined by McKay (1935) in
dealing with the distribution of deviations from the greatest observation in a sample; in fact,
his $F_n(x) = G_{n-1}(nx)/(2\pi)^{\frac12(n-1)}$. I am indebted to Dr H. O. Hartley for bringing this to
my notice.]

From (3) it follows that $G_k(\infty) = E(k, \infty)/\sqrt{(k+1)} = (2\pi)^{\frac12 k}/\sqrt{(k+1)}$.

3. Distribution of the mean deviation. Let a sample of n from the normal population
(supposed, without loss of generality, to have zero mean and unit variance) be $x_1, x_2, \ldots, x_n$.
Let the mean of the sample be $\bar{x}$, and let $d_i = x_i - \bar{x}$ (i = 1, 2, ..., n). Consider the case in
which k of the d's are negative and the rest positive. Then $1 \le k \le n - 1$. By suitably numbering
the d's the negative ones may be taken as $d_1, \ldots, d_k$; the mean deviation of the sample is then

$$m = n^{-1}[-d_1 - d_2 - \ldots - d_k + d_{k+1} + \ldots + d_n].$$

We now transform the sample space of the x's by means of the matrix equation Y = TX,
where X, Y are column vectors $(x_1, \ldots, x_n)$, $(y_1, \ldots, y_n)$ and T is the orthogonal matrix:

k columns 


1 

1 

] 

> 

sjn 

V» 

n — k 

n — k 

n — k 

\{kn(n - k)\ 

y/(kn{n - k)} 

<J{kn(n - k)} 

k— 1 

1 

1 

sim- in 

Hk-l)} 

- #(i-l)} 

0 

k- 2 

1 

yl{(k-l)(k-2)} 

- y /{(k-l)(k-2)} 

0 

0 

1 1 

V 2 V 2 

0 

0 

0 

0 

0 

0 


n — k columns 


/ — ~ 

l 

A 

1 

" -\ 

1 1 

\ n 


*Jn 

k 

k 

k 

*J{kn(n — k)\ 

y/[kn(n — k)} 

S l{kn{n - k)} 

0 

0 

0 

0 

0 

0 

0 

0 

0 

n — k — l 

1 

1 

V{(« — k)(n — k— 1 )} 

j{(n-k)(n-k- 1)} 

<J{(n — k) (n-k— 1)} 

0 

n — k — 2 

1 

y/{(n — k-l){n-k — 2)} 

,J{(n — k— l)(n — k — 2)} 


V2 V 2 


»• 

i 


3 


0 Q 


3 

I 

?S- 

I 

l—* 

3 

$ 


0 


0 




Then $y_1 = \sqrt{n}\,\bar{x}$, and $y_2 = \tfrac12 n^{3/2} m/\sqrt{\{k(n-k)\}}$. Since X = T'Y, T' being the transposed matrix
of T, the d's can easily be expressed in terms of the y's, and this gives, using equations (2),
that the appropriate region in the space of the co-ordinates $y_3, \ldots, y_{k+1}$ is a regular simplex
of side $y_2\{2k(n-k)/n\}^{1/2} = nm/\sqrt2$, and in the space of the co-ordinates $y_{k+2}, \ldots, y_n$, a regular
simplex also of side $nm/\sqrt2$. The frequency function of m is f(m) dm, which is the integral of
$(2\pi)^{-\frac12 n}\exp\{-\tfrac12(x_1^2 + \ldots + x_n^2)\}$, taken through the region in which the mean deviation lies
between m and m + dm. Since T is an orthogonal matrix, $(x_1^2 + \ldots + x_n^2) = (y_1^2 + \ldots + y_n^2)$, and
we may integrate over the variable $y_1$ (which is independent of m) from $-\infty$ to $\infty$. The
contribution to f(m) dm from the region described above is

$$(2\pi)^{-\frac12(n-1)}\, e^{-\frac12 y_2^2}\, dy_2\, \sqrt{\{k(n-k)\}}\; G_{k-1}(\tfrac12 nm)\, G_{n-k-1}(\tfrac12 nm).$$

Now, for a given value of k, there are ${}^nC_k$ ways of allocating the k negative d's and k may
vary from 1 to n − 1. Hence

$$f(m)\,dm = \frac{n^{3/2}}{2(2\pi)^{\frac12(n-1)}} \sum_{k=1}^{n-1} {}^nC_k \exp\left[-\frac{n^3 m^2}{8k(n-k)}\right] G_{k-1}(\tfrac12 nm)\, G_{n-k-1}(\tfrac12 nm)\, dm,$$

as obtained before (1945).
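The frequency function can be checked numerically for small n. The sketch below works in the variable $t = \tfrac12 nm$ and assumes the recursion $G_0 = 1$, $G_r(u) = \int_0^u G_{r-1}(t)\exp[-t^2/\{2r(r+1)\}]\,dt$, which is what the recurrence for E(k, a) gives when combined with (3); this is an inferred form, not a quotation of the 1945 definition. It then tests that f(m) integrates to unity and that the mean of m equals the known value $\sqrt{\{2(n-1)/(\pi n)\}}$ for a unit normal population. All names are illustrative.

```python
import math

def mean_deviation_check(n, t_max=12.0, steps=4000):
    """Return (total probability, mean of m) for sample size n by trapezoidal rule."""
    h = t_max / steps
    ts = [i * h for i in range(steps + 1)]

    # G[r][i] approximates G_r(ts[i]); G_0 is identically 1 (assumed recursion).
    G = [[1.0] * (steps + 1)]
    for r in range(1, n - 1):
        prev = G[-1]
        f = [prev[i] * math.exp(-ts[i] ** 2 / (2.0 * r * (r + 1)))
             for i in range(steps + 1)]
        cur = [0.0]
        for i in range(1, steps + 1):
            cur.append(cur[-1] + 0.5 * h * (f[i - 1] + f[i]))
        G.append(cur)

    # Density of t = n*m/2: f(m) dm with dm = (2/n) dt.
    const = math.sqrt(n) / (2.0 * math.pi) ** ((n - 1) / 2.0)
    dens = []
    for i, t in enumerate(ts):
        s = sum(math.comb(n, k)
                * math.exp(-n * t * t / (2.0 * k * (n - k)))
                * G[k - 1][i] * G[n - k - 1][i]
                for k in range(1, n))
        dens.append(const * s)

    def trapz(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    total = trapz(dens)
    mean_m = trapz([2.0 * t / n * d for t, d in zip(ts, dens)])
    return total, mean_m

for n in (2, 3, 4):
    total, mean_m = mean_deviation_check(n)
    print(n, round(total, 4), round(mean_m, 4))
```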

Exact descriptions of the sample space for small values of n may be of interest. For n = 2,
m is constant on two straight lines both perpendicular to the lines $\bar{x}$ = constant. For n = 3,
m is constant on a cylinder, whose axis is perpendicular to the planes $\bar{x}$ = constant, and whose
cross-section is a regular hexagon (i.e. ${}^3C_1 + {}^3C_2$ equal lines). For n = 4, m is constant, for
any space $\bar{x}$ = constant, on the surface of a cube of side 4m, whose corners have been 'filed
off' to give equilateral triangles of side $2\sqrt2\,m$. The surface thus consists of 6 (i.e. ${}^4C_2$) squares
and 8 (i.e. ${}^4C_1 + {}^4C_3$) equilateral triangles.


4. Moments of the distribution. We now derive a recurrence relation from which the
moments of the distribution can be calculated. We denote

$$\int_0^\infty \sum_{k=r}^{n-r} {}^{n-2r}C_{k-r}\; t^s \exp\left[-\frac{nt^2}{2k(n-k)}\right] G_{k-1}(t)\, G_{n-k-1}(t)\, dt$$

by I(n, r, s) ($r \ge 0$). (The summation is from 1 to n − 1 when r = 0.) For $s \ge 1$ we have, on
integrating by parts, suitably grouping terms and, in one case, changing the summation
variable from k to k + 1, the successive equations

k(n — k) 


It \ Pvit 2 tn K n -~ k ) r nP 
Hn, r, „) - J t S - «P [ - jj 


x {(• - 1 ) () (<) + I- * exp - 21 . ( ‘ 1( Oe.,11) 0,^1) 


t* 


/*0O 

-J. 2i 




k l(0 \ dt 


*-*C, 


+t* cxp| 2(n— le— l)(n— jt)J 
iK-exp 

■ r s [ - 2^-t-i) ] °‘- m e ~ i 





, . k(n-k) , (k+l){n-k-l) . 

and, Funce in ir C k . + - — Ln-tr(j 

n n h 


- / fc-r+l 


(to— 2r) (to— 2r — 1) 


TO 


»-^-n-r + 


r(n—r) 


TO 


- 2 r+X (7 


ft-r+l* 


we have 


/(*. r, ») - J(n , r , 2) + 


n 


n 


(n — 2r)(n — 2r — 1) T/ , r(n-r) T/ 

-f -l(n- l,r,$- 1) + — -/(w- l,r — — 1). (4) 


n 


n 


Now the sth moment of the distribution of m is $n^{1/2}(2\pi)^{-\frac12(n-1)}(2/n)^s\, I(n, 0, s)$ and by the aid
of (4) we can express I(n, 0, s) in terms of certain of the I(n, r, 0), I(n − 1, r, 0), etc. (For
a given value of s, we need values of r for which $0 \le 2r \le s$.) I(n, r, 0) is the integral from 0
to $\infty$ of certain terms in the frequency function of m, viz. ${}^{n-2r}C_{k-r}$ of those arising when k of
the d's are negative ($r \le k \le n - r$). To get the whole frequency function ${}^nC_k$ terms are taken
because that is the number of ways of choosing k negative d's. But if r of the d's are fixed as
positive and r as negative (i.e. a certain part only of the sample space is considered), then
the remaining (k − r) negative d's can be chosen in ${}^{n-2r}C_{k-r}$ ways. Hence I(n, r, 0) is the
integral of $(2\pi n)^{-1/2}\exp[-\tfrac12\Sigma x^2]$ over this restricted region of the sample space. In particular,
$I(n, 0, 0) = (2\pi)^{\frac12(n-1)}\, n^{-1/2}$, a result which also follows from the frequency function of m.

To evaluate the integral it is convenient to transform the sample space by the matrix T,
k being taken equal to 1. (This does not imply as much as before about the relation of the
mean to the other observations, since we shall now only restrict the signs of two of the d's.)
For r = 1, this gives the definition of the restricted region to be

$$z_2 > 0, \qquad \frac{z_2}{\sqrt{\{n(n-1)\}}} + z_3\sqrt{\left(\frac{n-2}{n-1}\right)} > 0.$$

The value of the integral of $(2\pi n)^{-1/2}\exp[-\tfrac12\Sigma z^2]$ over this region bears to its value over all
space the ratio (i) $\tfrac12\pi + \sin^{-1}[1/(n-1)]$ to (ii) $2\pi$, since (i) is the angle between the two
hyperplanes bounding the region. Hence

$$\frac{I(n, 1, 0)}{I(n, 0, 0)} = \frac14 + \frac{1}{2\pi}\sin^{-1}\frac{1}{n-1}.$$

The calculation of I(n, 2, 0) involves a much more complicated integration. The result
is that

T,„ o m (2 to)*<» «r 1 sin Ul/fre- lM + sm-^l/ftt- 3)] 

7(to, 2,0) = — + sV 

5{ (sin-i [1 /(to- 1)|) (si n- 1 fl /(to - 3)])} 

47T 2 


l /•sin~ 1 [l/(»--3)] 

7T 2 Jo 


tan" 1 


— (n — 1 ) (n — 4) tan 2 ^ 
n(n — 3) 



The result can be put in several forms, but this is the one which I personally have found 
most useful for computation. 

For larger r the I(n, r, 0) would presumably be even more complicated; the two given are
sufficient to determine the first five moments of the distribution. Since the d's tend to
independence for large n and the assignment of sign to a d reduces the sample space by
one-half, we have that

$$\lim_{n\to\infty} \frac{I(n, r, 0)}{I(n, 0, 0)} = \left(\tfrac12\right)^{2r}.$$

By the use of the recurrence relation (4) we can now find the moments and constants of 
shape of the distribution of m; these agree with the expressions obtained by Geary (1936) 
(see also some of his results quoted by Pearson (1945)), except that the coefficient of n~ 6 in 
the expansion of m\ (formula (22), 1936 paper) should be (51 — 352a 2 + (2136/5)a 4 ) and not 
(51 - 352a a + 427a 4 ). This mistake causes the coefficient of w* 6 in the expansion of 
(formula (24), same paper) to be 0*033578 instead of the correct 0*1 14635, and that of 
in the expansion of A 4 (formula (4), 1 945) to bo — 0* 120003 instead of the correct — 0*038946. 

We also have the new result 

0*792218 0-023893 0*097967 0*133384 

H Ha 1 - -'r" t * f- .... 


7V* 


n* 






5. Approximations to the distribution. The exact calculation of tables of the probability
integral of m is lengthy, and the labour increases as n does. Consequently it seemed worth
while to investigate the accuracy of approximations to the distribution which would involve
less work in their computation. This investigation was of an empirical nature and consisted
of comparing the values given for the percentage points of the distribution of m for n = 10
with the true values calculated by Hartley (1945). It was assumed that the approximation
which is most accurate for that value of n will also be the most accurate for other values.
The approximations considered were

(a) the normal distribution with the same first two moments as m;

(b) the distribution of the form $Km^{\alpha}e^{-\beta m^2}$ with the same first two moments as m;

(c) the Pearson curve (in fact, Type I) with the same first four moments as m;

(d) functions of m chosen so that their distributions are more nearly normal than that of m.

Of these methods, (d) is the most useful; however, a brief description of the others may be
of interest.

(a), as may be seen by the comparison in Table 2 following my earlier paper, is useless for
small n. Comparing it with (d) for n = 1000, it is found that the percentage points are given
with an error of about 0.01 in the extreme values (0.1 and 99.9 %).

(b), when fitted to sample size 10, gives the 99 % point with error 0.01 and the 99.9 % point
with error 0.02.

(c) gives percentage points for sample size 10 correct to three places of decimals for
cumulative probabilities between 1 and 99.8 %. The computation is laborious, however,
and gets heavier for large n, owing to the difficulty of calculating values of the incomplete
B-function for large p, q.

(d) is an application of a method due to Haldane (1937). The distributions of the functions

y = and z = (1 + (m — m)/g) h are considered, where k , g and h are chosen so that for 

the distribution of y 9 is 0(n~ z ), while for the distribution of z , /J x is 0(n r*) and /? 2 — 3 
is $O(n^{-2})$. To find k, g and h, powers of y and z are expanded in powers of $(m - \bar{m})$, and their
products with f(m) dm integrated with m ranging from 0 to $\infty$. The moments of y and z are
thus found in terms of the moments of m and can be arranged as power series in $n^{-1}$. (In a
review of Haldane's paper, Neyman (1938) points out that this method is not rigorous as
the series in $(m - \bar{m})$ are divergent for some values of m; this does not, however, necessarily
imply that the expansions of the moments of y and z in terms of $n^{-1}$, as far as is needed, are
false. An investigation of the remainder terms of these series would be of interest.) k, g and
h are now found so as to make the moments of y and z of the required order. To find g and
h a knowledge of the sixth moment of m is needed. To avoid calculation of I(n, 3, 0) it was
assumed that, for the distribution of m,

$$\mu_6/(15\mu_2^3) = 1 + O(n^{-1});$$

this seems reasonable in view of the fact that $\mu_4/(3\mu_2^2)$, $\mu_5/(10\mu_2\mu_3)$ are both $1 + O(n^{-1})$.
The results are that y 

is approximately normally distributed with mean 


0*0703 0*0714 0*0818 

n ri l 


and standard deviation 


while 


0*42375 / 0*4808 0* 

— 1 + + - 

\ 


-( 


4099 

n n 2 

m — m \° ,8fl96 
1+ 0 t 6006/ 


is approximately normally distributed with mean 



0*1115 0*0239 

n n 2 


0*0135 

n 3 


and standard deviation 


0*67205 1\ 0*0219 0*0082 

7 1 + + + 

yjn l n n z 



For sample size 10, y gives percentage points differing by amounts up to 0.006 (for 99 %)
from the true values, while z gives values differing by not more than 0.001. The two methods
give the same values (to three places of decimals) when n is greater than 35. The labour of
calculation does not increase with n, and for a percentage point which is frequently needed,
the equation giving y or z in terms of m can be inverted to give the desired value directly
as a power series in $n^{-1/2}$.


REFERENCES 

Geary, R. C. (1936). Biometrika, 28, 295-305.
Godwin, H. J. (1945). Biometrika, 33, 254-6.
Haldane, J. B. S. (1937). Biometrika, 29, 392-404.
Hartley, H. O. (1945). Biometrika, 33, 257-65.
McKay, A. T. (1935). Biometrika, 27, 466-71.
Neyman, J. (1938). Zbl. Math. 18, 257.
Pearson, E. S. (1945). Biometrika, 33, 252-3 (Editorial Note).


Biometrika 35 





A NOTE ON THE ASYMPTOTIC DISTRIBUTION OF RANGE 

By D. R. COX, Wool Industries Research Association 


1. Introduction and summary 

In a recent paper Elfving (1947) has given an asymptotic form for the distribution of range
in large samples.

In the present note, two other methods of obtaining an asymptotic form for the range 
distribution are discussed, which have the advantage of being expressed directly in terms of 
the range, while Elfving’s form involved a non-linear transformation of range. 

Numerical results are given for a normal population comparing the exact distribution of 
range with the two approximations discussed here, and with Elfving’s approximation. 


2. Derivation from Fisher and Tippett's results on the distribution of extremes

Fisher & Tippett (1928) have obtained results for the distribution of the least and greatest
members of large random samples. Now in large samples it is clear that least and greatest
values are effectively independent, and so a form for the asymptotic distribution of range
can be derived by integration from the joint distribution of least and greatest values.*
Consider random samples of size n taken from a population whose frequency function is
$\phi(x)$ and whose distribution function is $\Phi(x)$. Define $x_n^{(1)}$ and $x_n^{(2)}$ by

$$\Phi(x_n^{(1)}) = 1 - \frac{1}{n}, \qquad (1)$$

$$\Phi(x_n^{(2)}) = \frac{1}{n}. \qquad (2)$$

Let

$$y_1 = n(x_1 - x_n^{(1)})\,\phi(x_n^{(1)}), \qquad (3)$$

$$y_2 = n(x_2 - x_n^{(2)})\,\phi(x_n^{(2)}), \qquad (4)$$

where $x_1$ and $x_2$ are the greatest and least members of the sample.

Then Fisher & Tippett have shown that if $\phi(x)$ tends to zero exponentially or faster as
x tends to infinity, the limiting frequency functions of $y_1$ and $y_2$ are

$$\exp(-y_1 - e^{-y_1}) \quad\text{and}\quad \exp(y_2 - e^{y_2}).$$

Thus the limiting joint-frequency function of $y_1$ and $y_2$ is

$$\exp(-y_1 - e^{-y_1} + y_2 - e^{y_2}),$$

and so if

$$W = y_1 - y_2, \qquad (5)$$

the limiting frequency function of W is

$$\int_{-\infty}^{\infty} \exp(-W - e^{-W-y_2} - e^{y_2})\, dy_2 = 2e^{-W} K_0(2e^{-\frac12 W}). \qquad (6)$$

In equation (6), $K_0(x)$ is a modified Bessel function of the second kind (Watson, 1944, p. 78).


* Since I wrote this paper, it has been pointed out to me that this method has recently been discussed
at length by E. J. Gumbel (1947). In the case of a symmetrical distribution, W of equation (5) below
is termed by Gumbel the 'reduced range' and denoted by R. For the case of a normal population
he compares the exact probability levels of the sample range with those derived from the asymptotic
distribution of equation (6) for which he has calculated certain values of the probability integral. He
has not, however, considered the rather closer approximation which I discuss in § 3 below.




Suppose now that the basic distribution is symmetrical and has mean zero, so that

$$x_n^{(2)} = -x_n^{(1)}$$

and $\phi(x_n^{(1)}) = \phi(x_n^{(2)})$.

Then if $w_n$ is the sample range

$$w_n = 2x_n^{(1)} + \frac{W}{n\,\phi(x_n^{(1)})}. \qquad (7)$$

Thus (6) leads immediately to a form for the asymptotic frequency function of range.

It can be shown directly from (6), or inferred from Fisher & Tippett's work, that the mean
and standard deviation of W in the distribution (6), are

$$\bar{W} = 2\gamma, \qquad (8)$$

$$\sigma_W = \pi/\sqrt3, \qquad (9)$$

where $\gamma$ is Euler's constant.

Also $\sqrt{\beta_1} = 0.806$, (10)

and $\beta_2 = 4.2$. (11)
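The constants of the limiting distribution of W can be checked by direct numerical integration of the frequency function $2e^{-W}K_0(2e^{-\frac12 W})$. The sketch below is illustrative only: it evaluates $K_0$ from the standard integral representation $K_0(x) = \int_0^\infty \exp(-x\cosh t)\,dt$ by the trapezoidal rule, and the grid limits and step sizes are arbitrary choices.

```python
import math

EULER_GAMMA = 0.5772156649015329

# Pre-computed cosh values for the integral representation of K_0.
_T_STEP = 0.05
_COSH = [math.cosh(i * _T_STEP) for i in range(1, 601)]

def k0(x):
    """Modified Bessel function K_0(x) = integral_0^inf exp(-x cosh t) dt."""
    total = 0.5 * math.exp(-x)
    for c in _COSH:
        total += math.exp(-x * c)
    return total * _T_STEP

def limiting_range_constants(w_min=-6.0, w_max=30.0, steps=720):
    """Return (total, mean, s.d., skewness, beta_2) of the limiting law of W."""
    h = (w_max - w_min) / steps
    ws = [w_min + i * h for i in range(steps + 1)]
    dens = [2.0 * math.exp(-w) * k0(2.0 * math.exp(-0.5 * w)) for w in ws]

    def trapz(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    total = trapz(dens)
    mean = trapz([w * d for w, d in zip(ws, dens)]) / total
    mu2 = trapz([(w - mean) ** 2 * d for w, d in zip(ws, dens)]) / total
    mu3 = trapz([(w - mean) ** 3 * d for w, d in zip(ws, dens)]) / total
    mu4 = trapz([(w - mean) ** 4 * d for w, d in zip(ws, dens)]) / total
    return total, mean, math.sqrt(mu2), mu3 / mu2 ** 1.5, mu4 / mu2 ** 2

total, mean, sd, skew, beta2 = limiting_range_constants()
print(round(total, 4), round(mean, 4), round(sd, 4), round(skew, 3), round(beta2, 2))
```

The skewness printed here is the third moment about the mean divided by the 3/2 power of the second, that is, the square root of $\beta_1$.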

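If we combine these results with equation (7), expressing the range $w_n$ in terms of W, we
find that for large n

$$\bar{w}_n \sim 2x_n^{(1)} + \frac{2\gamma}{n\,\phi(x_n^{(1)})}, \qquad (12)$$

$$\sigma_{w_n} \sim \frac{\pi}{\sqrt3\; n\,\phi(x_n^{(1)})}. \qquad (13)$$

The limiting $\beta$ coefficients of $w_n$ are the same as those of W.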
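For a normal population of unit standard deviation, the asymptotic mean and standard deviation of the range can be evaluated as in the following sketch. It assumes that $x_n^{(1)}$ is the upper 1/n point of the population and uses the standard-library `statistics.NormalDist` for the normal quantile and ordinate; the function name is illustrative.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329

def asymptotic_range_constants(n):
    """Asymptotic mean and s.d. of the range in samples of n from a unit normal."""
    nd = NormalDist()                      # mean 0, standard deviation 1
    x1 = nd.inv_cdf(1.0 - 1.0 / n)         # point exceeded with probability 1/n
    scale = 1.0 / (n * nd.pdf(x1))         # 1 / (n * phi(x1))
    mean = 2.0 * x1 + 2.0 * EULER_GAMMA * scale
    sd = math.pi / math.sqrt(3.0) * scale
    return mean, sd

for n in (20, 50):
    mean, sd = asymptotic_range_constants(n)
    print(n, round(mean, 3), round(sd, 3))
```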

A numerical comparison of these results with the exact results for the normal distribution 
is given in §4. 


3. Asymptotic form by the method of steepest descents 

The exact expression for the frequency function of range involves an integral. For large n,
this integral can be evaluated by the method of steepest descents and this leads to a further
form for the asymptotic frequency function of range.

The frequency function, $f_n(w_n)$, of the range $w_n$ in samples of size n, is known to be

$$f_n(w_n) = n(n-1)\int_{-\infty}^{\infty} \phi(u - \tfrac12 w_n)\,\phi(u + \tfrac12 w_n)\,[\Phi(u + \tfrac12 w_n) - \Phi(u - \tfrac12 w_n)]^{n-2}\, du$$

$$= n(n-1)\int_{-\infty}^{\infty} \phi(u - \tfrac12 w_n)\,\phi(u + \tfrac12 w_n)\,\exp[(n-2)\psi(u, w_n)]\, du, \qquad (14)$$

where $\psi(u, w_n) = \log[\Phi(u + \tfrac12 w_n) - \Phi(u - \tfrac12 w_n)]$. (15)

It will now be supposed for simplicity that $\phi(x)$ is a symmetrical unimodal frequency
distribution with mean zero; that is, that

<}>(x) — — 0 {x 4=0). 

Then for fixed $w_n$, the function of u, $\psi(u, w_n)$, has a single maximum at u = 0. The integral
(14) is thus of the form that can be evaluated by the method of steepest descents (Watson,
1944, p. 236).

The basic idea is that as n tends to infinity, by far the greatest portion of the integral (14)
comes from the neighbourhood of u = 0.



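Write $\psi(u, w_n) = \psi(0, w_n) - \tfrac12 t^2$, (16)

and then it is found that

$$f_n(w_n) = n(n-1)\exp[(n-2)\psi(0, w_n)]\int_{-\infty}^{\infty} \phi(u - \tfrac12 w_n)\,\phi(u + \tfrac12 w_n)\,\frac{du}{dt}\,\exp[-\tfrac12(n-2)t^2]\, dt. \qquad (17)$$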
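The equation (16) can be solved for u as a power series in t, and so the expression

$$\phi(u - \tfrac12 w_n)\,\phi(u + \tfrac12 w_n)\,\frac{du}{dt}$$

can be obtained as a function of t.

It is found that

$$\phi(u - \tfrac12 w_n)\,\phi(u + \tfrac12 w_n)\,\frac{du}{dt} = \frac{[\phi(\tfrac12 w_n)]^2}{[-\psi''(0, w_n)]^{1/2}}\,[1 + A_2(w_n)\,t^2 + \ldots].$$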


In this expression $A_2(w_n)$ is a complicated function of $\Phi(\tfrac12 w_n)$, $\phi(\tfrac12 w_n)$, $\phi''(\tfrac12 w_n)$, whose
exact form will not be needed, and


Thus 
/»(«’») = 




n(n -\ ) expj [(n - 2) ^(0,w„ )] [0( \w n )]* 

[-*'(0, »„)]»" 


/: 


[1+A 2 (m>„) <*+...] exp 


(ft — 2) t 2 


(18) 


An asymptotic expansion in inverse powers of (n − 2) follows if the integral may be evaluated
formally term by term. That this is in fact permissible may be shown by an application
of Watson's lemma (Watson, 1944, p. 236).


Thus f n (w n ) 


ft(w - 1 ) (2ft)* e x p [(n - 2 ) f ( 0, w n )] j 1 ^ a (»J 

[-^"( 0 , m >„)]1 |(ra — 2 )* (»- 2 )« 


w »(2 tt )* [ <f>(bw n )Y 
[ 0 '( ->„)]* 


[0(^ n )-<I)(- 



To obtain the last expression, the definition (15) of ijr(u, w n ) has been used. 
Put T(a;) = <!>(*)— O(-x). 

Then the first term of the expansion (19) is 


ft »( 2 ft )*#( K ,,)] 2 

[f(-K)]‘ 




(19) 


(20) 


( 21 ) 


Now it can be shown that the function $A_2(w_n)$ is negative for the normal and many other
distributions. A better approximation than (21) is therefore 




( 22 ) 


where in this expression the constant c n is to be chosen to make the integral from zero to 
infinity equal to unity. 

In §4 some numerical results are given comparing (22) with the exact distribution, for 
the case when the basic distribution is normal. 



D. R. Cox 




4. Numerical comparison with exact values 

In this section the approximate results obtained above are compared with exact values for 
the normal distribution. 

Figs. 1-4 give the frequency functions of range for n = 20 and n = 50 when the basic
distribution is normal with unit standard deviation. The exact distribution was obtained 
from E. S. Pearson & H. O. Hartley's table (1942) for n = 20, and by numerical integration 
for n = 50. 

Table 1 gives the exact values of mean, standard deviation, and $\beta_2$ of range, compared
with the values from the formulae (8)-(11) of § 2, and with numerically computed values from
the approximation of § 3 and from Elfving's approximation mentioned in § 1. The exact
values, based on the work of Tippett (1925) and E. S. Pearson (1926), have been taken from 
Tables for Statisticians and Biometricians , Part II (K. Pearson, 1931, p. cxvii). 


Table 1. Constants of the range distribution in samples from a
normal distribution of unit standard deviation

            Approx. of § 3                    Elfving's approx.
  n     Mean    S.D.    β₁      β₂        Mean    S.D.    β₁      β₂
  20    3.83    0.78    0.21    3.42      3.76    0.77    0.07    3.08
  50    4.58    0.69    0.24    3.33      4.52    0.67    0.13    3.18


Figs. 1-4 show that the steepest descents approximation is more accurate than the 
approximation of § 2, although even for n = 20 the latter gives the ordinates of the range 
distribution fairly closely in the centre of the distribution but not in the tails. 

This is confirmed by the moments given in the table. The Bessel function approximation
of § 2 gives the mean range reasonably accurately, the standard deviation less accurately,
while the β coefficients are very different (for the range of values of n considered) from the
exact values. This is in accordance with the results of Fisher & Tippett, who showed that
for the distribution of extremes the limiting β values are not reached until n is about $10^{12}$.

The steepest descents approximation of § 3, and Elfving's approximation mentioned in
§ 1 are both more accurate. For n = 50 Elfving's approximation gives the mean range to
½ % and the standard deviation of range to 3 %. The steepest descents approximation is less
accurate, the corresponding figures being 1½ and 6 %. The β coefficients appear to be given
slightly more accurately by the steepest descents approximation.

[Figs. 1-4. Distribution of range in samples from normal population of unit standard deviation.
Fig. 3: exact form compared with Bessel function approximation of § 2. Fig. 4: exact form
compared with steepest descents approximation of § 3. Horizontal axis: sample range (w).]

The properties of the three approximations may be summarized as follows for the case 
when the basic distribution is normal. 

Elfving’s approximation is the most accurate with the steepest descents approximation 
second. 

The Bessel function approximation gives fairly accurate estimates of the ordinates of the 
range distribution near the mode, but does not reproduce the ordinates in the tails until 
n is very large indeed. 

The disadvantage of Elfving’s method is that it involves a non-linear transformation of 
the range. This makes it unlikely that it can be used algebraically to simplify mathematical 
problems involving the distribution of range. 

I wish to thank the Director of Research, Wool Industries Research Association, for 
permission to publish this paper, and Dr H. E. Daniels for his interest in the work. 


REFERENCES 


Elfving, G. (1947). Biometrika, 34, 111.

Fisher, R. A. & Tippett, L. H. C. (1928). Proc. Camb. Phil. Soc. 24, 180.

Gumbel, E. J. (1947). Ann. Math. Statist. 18, 384.

Pearson, E. S. (1926). Biometrika, 18, 173.

Pearson, E. S. & Hartley, H. O. (1942). Biometrika, 33, 301. 

Pearson, K. (1931). Tables for Statisticians and Biometricians, Part II. Cambridge University Press. 
Tippett, L. H. C. (1925). Biometrika, 17, 304. 

Watson, G. N. (1944). Theory of Bessel Functions, 2nd ed. Cambridge University Press. 





ON THE ROLE OF VARIABLE GENERATION TIME IN THE 
DEVELOPMENT OF A STOCHASTIC BIRTH PROCESS 

By DAVID G. KENDALL, Magdalen College, Oxford

1. Introduction. The ‘birth-and-death’ process introduced by W. Feller (1939) 
provides a mathematical description of the growth of a population under the influence of 
very much simplified laws of reproduction and mortality. The population size n is considered 
as a random function of the time t, and the equations governing its development are derived 
from the following assumptions: 

(i) The value of n at some initial time (t = 0) is supposed given. In the simplest case
n(0) = 1.

(ii) It is assumed that the subpopulations stemming from two co-existing individuals 
will develop in complete independence of one another. 

(iii) The risks of mortality and reproduction are supposed to be the same for each member 
of the population. 

(iv) An individual known to be alive at time t has a chance asymptotically equal to
λ dt of reproducing itself, and a chance asymptotically equal to μ dt of dying during the
subsequent elementary time interval of length dt. Reproduction is here taken to imply 
binary subdivision, and so to result in the addition of just one member to the population. 

(v) The chances of reproduction and mortality, described at (iv) above, are supposed 
to be completely independent of the previous history of the individual, and in particular to 
be independent of the time which has elapsed since its own 'birth' in some earlier subdivision.

(vi) The birth and death rates λ and μ are supposed to be independent of the epoch t.

Assumption (v) implies that the stochastic dependence of n upon t can be described with
the aid of a discontinuous Markoff process,* and, in fact, it is easily seen that the functions

$$P_n(t) = \text{Probability}\,\{n(t) = n \mid n(0) = 1\},$$

which completely determine the structure of the process, must satisfy the set of
differential-difference equations

$$\frac{d}{dt}P_n(t) = (n+1)\,\mu\,P_{n+1}(t) + (n-1)\,\lambda\,P_{n-1}(t) - n(\lambda+\mu)\,P_n(t) \quad (n \ge 1),$$

$$\frac{d}{dt}P_0(t) = \mu\,P_1(t). \tag{1}$$

Feller's equations (1) were solved by C. Palm,† the mean and variance of n as functions of
t having been already given in Feller's paper.

The idealized population whose growth is described by the equations (1) is, of course, 
rather far removed from reality, and it is therefore of some interest to consider how far any

* The development of a system is said to follow a Markoff process if its state at any time t can be
described by the value of a random time-dependent variable X(t) with the following property: let the
value of $X(t_0)$ be known; then if $t_1 > t_0$, the conditional distribution of the random variable $X(t_1)$ is in no way
affected if the value of X(t) is also given for any $t < t_0$. Further details and some references will be found
in my recent review (Kendall, 1947).

† Palm's formulae are quoted by N. Arley & V. Borchsenius (1945). See also M. S. Bartlett (1947)
and D. G. Kendall (1948a).




relaxation is possible in the assumptions (i)-(vi). Of these the first two and the last two are 
the ones urgently requiring consideration; progress is most likely to be made by attacking 
them independently, and a good deal of work has already been done in this respect. If (ii) 
is retained, (i) is easily relaxed, for the n(0) initial individuals then generate independent
subpopulations each commencing with a single member, and it is therefore sufficient to
raise the generating function for the $\{P_n(t)\}$ to the power n(0). An interesting extension of
(ii) has been considered by Feller, who discusses the birth-and-death process for which the
rates λ and μ, instead of being constants, are linearly dependent on the instantaneous
population size n. This model, the equations for which are as yet unsolved, corresponds to
a population growing according to the logistic law of Pearl, Verhulst and Reed in the parallel
deterministic theory. In a recent paper (Kendall, 1948b) I have given an account of the
birth-and-death process in which the birth- and death-rates λ and μ, instead of being
constants as in (vi), can be any desired functions of the epoch t; the most interesting example is
perhaps that in which λ and μ are periodic functions of t.

The one important assumption which has not yet been relaxed is that of the Markoff 
property (v). This implies, for example, that in the absence of mortality an individual 'born'
at time t will itself undergo subdivision at a time t + τ, where the generation time τ has the
distribution

$$e^{-\lambda\tau}\,\lambda\,d\tau \quad (0 < \tau < \infty). \tag{2}$$


This is, of course, very different from the distributions of generation time actually observed;
not only for man, but also for such elementary organisms as bacteria, the observed
distribution usually possesses a pronounced non-zero mode, and the other extreme assumption
of a fixed generation time $\tau = \tau_0$ (implying an exact doubling of the population at regular
intervals) might seem to be more realistic.

Now the distribution (2) would assert that τ is a multiple of a $\chi^2$-variate having two degrees
of freedom, and this suggests that it would be worth while examining a modified process in
which τ is distributed as the multiple of a $\chi^2_{2k}$, where k is an integer greater than unity.
The present paper is devoted to a development of this idea, and to simplify the analysis
attention will here be confined to purely reproductive processes (μ = 0). Before proceeding
to details it is of interest to note what values of k are likely to be of practical relevance.

The most extensive information on the distribution of generation times for bacteria 
appears to be that communicated by C. D. Kelly & Otto Rahn (1932). They maintained 
bacteria in a ‘ warm stage ’ at 30° C., and measured (by continuous microscopic observation) 
the fission times for the second, third and fourth generations descended from each of a large 
number of individual cells. Their results concerning Bacterium aerogenes may be quoted in 
illustration. Observations were made, for this bacterium, on nine different days, the 
number of generation times measured per day varying from 30 to 126; the results for the 
three largest sets (each consisting of more than a hundred observations) are shown in histo- 
gram form in the accompanying diagram (Fig. 1). The sets relating to the several days 
have not been pooled because it is evident that there was a day-to-day variation in the 
mean generation time. 

It will be seen at once that the assumption of a $\chi^2_{2k}$ distribution (with a change of scale
to allow for the correct mean time) is at least not an unreasonable one, and that it is greatly
superior to either of the alternatives previously available (which correspond to k = 1 and
k = ∞). While it will certainly be of interest to examine the adequacy of the $\chi^2_{2k}$ hypothesis
in more detail on another occasion, it is enough for the moment to know that it provides
a way of introducing into a model of population growth a law of variation of generation time
which bears a general resemblance to that obtaining in reality.

[Fig. 1]

Table 1. Bacterium aerogenes at 30° C. Preliminary analysis
of the observed generation times, τ
(Data of Kelly and Rahn)

  Date        No. of          Geometric mean    'Coefficient of     Estimate
              observations    of τ (min.)       variation' (%)      of k
  17 Feb.         44              38.5              17.6               33
  24 Feb.         60              34.8              22.8               20
   2 Mar.         84              28.9              27.7               14
   3 Mar.        126              33.6              27.2               14
   6 Mar.         84              29.6              23.0               19
  10 Mar.         93              32.4              30.1               12
  17 Mar.        112              22.6              23.9               18
  12 Nov.        100              23.1              14.6               47
  14 Nov.         30              24.9              24.7               17

It will be useful for the further developments of this paper to have in mind a rough estimate
of the parameter k derived from the above data. Such an estimate can conveniently be based
on the observed variance of the natural logarithm of the generation time, which on the $\chi^2_{2k}$
hypothesis will have the expected value*

$$\sigma^2(\log\tau) = 1/(k-\tfrac12), \text{ approximately.}$$

It is worth noting that the observed standard deviation of the logarithm of the generation
time τ is roughly the same thing as the coefficient of variation of τ; it has been listed as such
in Table 1. The observed variances of log τ, when analysed by M. S. Bartlett's well-known
test for homogeneity, were found to be significantly discordant ($\chi^2_8 = 62$), and no attempt has
been made, therefore, to pool the nine estimates of k. For present purposes only the order
of magnitude of k is required, and the value

k = 20

is evidently satisfactory, though, of course, there is no reason to suppose that it will be
appropriate to describe the growth of any organism other than the one example discussed
here.

* See, for example, M. S. Bartlett & D. G. Kendall (1946). (The error involved in this approximation
is less than 1 % for k > 3.)
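The estimates of k in Table 1 can be reproduced from the listed coefficients of variation. The following is a minimal sketch, assuming (as the text indicates) that the coefficient of variation is taken as the standard deviation of log τ and that the approximation σ²(log τ) = 1/(k − ½) is inverted directly; the function name `estimate_k` is hypothetical.

```python
def estimate_k(cv_percent):
    """Estimate k from a coefficient of variation given in per cent.

    Uses var(log tau) ~ 1 / (k - 1/2), with the coefficient of variation
    standing in for the standard deviation of log tau.
    """
    variance_of_log = (cv_percent / 100.0) ** 2
    return 1.0 / variance_of_log + 0.5

# Coefficients of variation (%) in the order of the rows of Table 1.
observed_cv = [17.6, 22.8, 27.7, 27.2, 23.0, 30.1, 23.9, 14.6, 24.7]
estimates = [round(estimate_k(cv)) for cv in observed_cv]
print(estimates)
```

Rounded to the nearest integer, these values agree with the last column of Table 1.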

2. Specification of the multiple-phase birth process. The fundamental equations. 

The process to be considered may be specified as follows. When a new individual is ‘born’, 
it passes through a series of phases, k in number, and only after it has attained the kth phase
can it undergo subdivision. The lifetime in each phase is assumed to follow the law of
distribution

$$e^{-k\lambda\tau}\,k\lambda\,d\tau \quad (0 < \tau < \infty),$$

the several lifetimes being independent, and the incident terminating the life of the
individual in the kth phase being (i) its death and (ii) the birth, simultaneously, of two
individuals who commence their existence in the first phase at that instant. Of course,
the formal treatment would not in any way be altered if instead it were assumed that an
individual terminates its existence in the kth phase by giving birth to one individual in
the first phase, and simultaneously returning to the latter itself.

The multiple-phase birth process possesses the Markoff property, provided that its
development is described by the vector variate $\mathbf{n}$, the k components of which enumerate the
individuals existing in each of the k phases. If $\mathbf{n} \equiv (n_1, n_2, \ldots, n_k)$, and

$$n = n_1 + n_2 + \ldots + n_k,$$

then n, the total population size irrespective of phase, describes the growth of a birth process
in which the generation time τ has the distribution of

$$\frac{1}{2k\lambda}\,\chi^2_{2k}.$$

When k = 1, the multiple-phase process is identical with the simple birth process discussed
by Feller (i.e. that governed by equations (1), with μ = 0), while if k is allowed to tend to
infinity, one obtains the deterministic birth process with a fixed generation time $\tau_0 = 1/\lambda$.
The multiple-phase process thus includes both the models already mentioned, and bridges
the gap between them. Incidentally its study may throw some light on the effect of the
'Markoff' assumption in calculations of this sort. In this context, reference may be made
to the general remarks in my review article (1947).
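The generation-time law implied by the k phases can be checked by direct sampling. The sketch below is an illustration only: it draws each generation time as the sum of k independent exponential phase lifetimes with rate kλ, and compares the sample mean with 1/λ and the sample coefficient of variation with k^(−1/2). The function name, the seed and the sample size are assumptions of the sketch.

```python
import random
import statistics

def generation_time(k, lam, rng):
    """One generation time: the sum of k independent phase lifetimes,
    each exponentially distributed with rate k * lam."""
    return sum(rng.expovariate(k * lam) for _ in range(k))

rng = random.Random(1)          # fixed seed so that the run is repeatable
k, lam = 20, 1.0
sample = [generation_time(k, lam, rng) for _ in range(20000)]

mean = statistics.fmean(sample)
cv = statistics.pstdev(sample) / mean
print(round(mean, 3), round(cv, 3))   # compare with 1/lam and k ** -0.5
```

With k = 20 the coefficient of variation is about 22 %, which is of the order observed for Bacterium aerogenes.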

It will be supposed that initially the population consists of a single individual in the first 
phase. (Because of a well-known property of the distribution (2), it makes no difference 
whether this individual was born at the time t = 0 or at some earlier instant.) The course of 
events leading up to the first subdivision in what might be a typical case is shown in the 
accompanying diagram (Fig. 2). The stochastic development of the process will be fully
described once the function

$$P(n_1, n_2, \ldots, n_k;\, t)$$

is known; this is the probability that at time t there will be $n_1$ individuals in the first phase,
$n_2$ in the second phase, and so on. The differential-difference equations which correspond
here to Feller's equations (1) can be written most concisely if one adopts the convention
that P = 0 whenever any of the $n_i$ are negative. It will then readily be seen that

[Fig. 2. Vertical axis: phase.]

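$$\frac{\partial}{\partial t}P(n_1,\ldots,n_k;t) = \sum_{j=1}^{k-1}(n_j+1)\,k\lambda\,P(n_1,\ldots,n_j+1,\,n_{j+1}-1,\ldots,n_k;t)$$

$$\qquad + (n_k+1)\,k\lambda\,P(n_1-2,\,n_2,\ldots,n_{k-1},\,n_k+1;t) - (n_1+n_2+\ldots+n_k)\,k\lambda\,P(n_1,\ldots,n_k;t), \tag{3}$$

and if one introduces the generating function

$$\phi(z_1,\ldots,z_k;t) = \sum P(n_1,\ldots,n_k;t)\,z_1^{n_1}z_2^{n_2}\ldots z_k^{n_k}, \tag{4}$$

it will follow that this must satisfy the partial differential equation

$$\frac{1}{k\lambda}\frac{\partial\phi}{\partial t} = (z_2-z_1)\frac{\partial\phi}{\partial z_1} + (z_3-z_2)\frac{\partial\phi}{\partial z_2} + \ldots + (z_1^2-z_k)\frac{\partial\phi}{\partial z_k}, \tag{5}$$

the associated boundary condition being of course

$$\phi = z_1 \text{ when } t = 0. \tag{6}$$

The partial differential equation (5) is of the standard Lagrangian form, the auxiliary
equations being

$$\frac{d\phi}{0} = \frac{k\lambda\,dt}{1} = \frac{dz_1}{z_1-z_2} = \frac{dz_2}{z_2-z_3} = \ldots = \frac{dz_{k-1}}{z_{k-1}-z_k} = \frac{dz_k}{z_k-z_1^2}. \tag{7}$$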



To solve these, it is convenient to introduce a new time scale defined by kλt = θ, and to write
$X_i = e^{-\theta}z_i$; the equations (7) then become

$$X_j' = -X_{j+1} \quad (j = 1, 2, \ldots, k-1), \qquad X_k' = -e^{\theta}X_1^2,$$

and so $X_i = \left(-\dfrac{d}{d\theta}\right)^{i-1}X(\theta)$, (8)

where X(x) satisfies the ordinary differential equation

$$\left(-\frac{d}{dx}\right)^{k}X(x) = e^{x}\{X(x)\}^2. \tag{9}$$

Let the general solution of (9) be

$$X(x) = F(x;\,c_1, c_2, \ldots, c_k),$$

where the arbitrary constants $\{c_i\}$ are so chosen that

$$c_i = F^{(i-1)}(0;\,c_1, c_2, \ldots, c_k) \quad (1 \le i \le k). \tag{10}$$

Then $X_i = (-)^{i-1}F^{(i-1)}(\theta;\,c_1, c_2, \ldots, c_k)$, (11)

and if these equations when solved for $\{c_j\}$ take the form

$$c_j = G_j(\theta;\,X_1, X_2, \ldots, X_k), \tag{12}$$

the general solution to (5) will be*

$$\phi(z_1, z_2, \ldots, z_k;\,t) \equiv \Phi(G_1, G_2, \ldots, G_k), \tag{13}$$

where Φ is an arbitrary function of its k arguments.

The unknown function Φ must now be identified with the aid of the boundary condition (6).
To this end, put θ = 0 in (11) and (12) and compare with (10); it will then be seen that

$$G_j(0;\,u_1, u_2, \ldots, u_k) \equiv (-)^{j-1}u_j.$$

From the boundary condition (6), however, it follows that

$$z_1 = \phi(z_1, z_2, \ldots, z_k;\,0) \equiv \Phi\{\ldots, G_j(0;\,z_1, z_2, \ldots, z_k), \ldots\} = \Phi(z_1, -z_2, \ldots, (-)^{k-1}z_k),$$

and so the function Φ must be given by

$$\Phi(G_1, G_2, \ldots, G_k) \equiv G_1.$$

The generating function φ is thus $G_1(\theta;\,X_1, X_2, \ldots, X_k)$, and this is the value of $c_1$, i.e. of
$F(0;\,c_1, c_2, \ldots, c_k)$, when the $\{c_i\}$ are to be determined from the k equations

$$z_i e^{-\theta} = (-1)^{i-1}F^{(i-1)}(\theta;\,c_1, c_2, \ldots, c_k).$$

Expressing this result in different words: $\phi(z_1, z_2, \ldots, z_k;\,t)$ is equal to the value of X(0), when
X(x) satisfies the equation (9) together with the boundary conditions

$$X(\theta) = z_1e^{-\theta}, \quad X'(\theta) = -z_2e^{-\theta}, \quad \ldots, \quad X^{(k-1)}(\theta) = (-)^{k-1}z_ke^{-\theta}.$$

Before working out an example in detail, it is convenient to throw the solution into a more
familiar form by writing x = θ − u and $X(x) = e^{-\theta}Z(u)$. It will then be seen that the generating
function φ for the multiple-phase birth process is equal to $e^{-k\lambda t}Z(k\lambda t)$, where the function Z(u)
is determined by the differential equation

$$\left(\frac{d}{du}\right)^{k}Z(u) = e^{-u}\{Z(u)\}^2, \tag{14}$$

and the boundary conditions $Z^{(i)}(0) = z_{i+1}$ (0 ≤ i ≤ k − 1). (15)

* The arguments of φ will henceforth, for typographical convenience, be written horizontally.


This is a perfectly straightforward one-point boundary problem, although unfortunately
the equation (14) seems to be intractable for values of k greater than unity. A determination
of the P-functions for the multiple-phase process is therefore not practicable; it is, however,
possible to discuss the mean and variance of the distribution, and its approximate normality
for large k, without solving the equation (14). These matters will be considered in the
following sections of the paper; for the moment it is of interest to leave the general argument
and examine the solution of the equation (14) in the elementary case when k = 1.

The equation is then $Z' = e^{-u}Z^2$, with the single boundary condition Z(0) = z. The general
solution is

$$\frac{1}{Z} = e^{-u} + \text{constant},$$

and on identifying the arbitrary constant this becomes

$$Z(u) = \frac{z}{1 - z(1 - e^{-u})}.$$

Accordingly the generating function is

$$\phi(z;\,t) = \frac{z\,e^{-\lambda t}}{1 - z(1 - e^{-\lambda t})},$$

and

$$P(n, t) = e^{-\lambda t}(1 - e^{-\lambda t})^{n-1} \quad (n \ge 1),$$

in agreement with Feller's original calculation (1939, equation (17) with N = 1). For future
reference it will be convenient to note here the mean and variance of this distribution, as
already given by Feller; they are

$$E(n) = e^{\lambda t} \quad\text{and}\quad \mathrm{Var}\,(n) = e^{\lambda t}(e^{\lambda t} - 1). \tag{16}$$
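Although the equation for Z(u) is intractable analytically when k exceeds unity, the one-point boundary problem is easy to integrate numerically. The following minimal sketch uses the classical fourth-order Runge-Kutta rule on the equivalent first-order system; it takes the equation in the form $Z^{(k)}(u) = e^{-u}Z(u)^2$ with $Z^{(i)}(0) = z_{i+1}$, and the function names and step count are assumptions of the sketch, not part of the paper.

```python
import math

def solve_Z(z, u_end, steps=4000):
    """Integrate Z^(k)(u) = exp(-u) * Z(u)**2 from u = 0 to u_end,
    with Z^(i)(0) = z[i] for i = 0..k-1, and return Z(u_end)."""
    def rhs(u, y):
        # y = [Z, Z', ..., Z^(k-1)]; the derivative shifts the list up by one.
        return y[1:] + [math.exp(-u) * y[0] ** 2]

    y, u, h = list(z), 0.0, u_end / steps
    for _ in range(steps):
        k1 = rhs(u, y)
        k2 = rhs(u + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = rhs(u + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = rhs(u + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        u += h
    return y[0]

def generating_function(z, lam_t):
    """phi(z_1, ..., z_k; t) = exp(-k*lam*t) * Z(k*lam*t)."""
    theta = len(z) * lam_t
    return math.exp(-theta) * solve_Z(z, theta)

# k = 1: compare with the closed form z e^{-lt} / (1 - z (1 - e^{-lt})).
z, lam_t = 0.5, 1.5
closed = z * math.exp(-lam_t) / (1.0 - z * (1.0 - math.exp(-lam_t)))
numeric = generating_function([z], lam_t)
print(numeric, closed)

# k = 3 with every argument equal to unity: the probabilities must sum to one.
total_probability = generating_function([1.0, 1.0, 1.0], 1.0)
print(total_probability)
```

The second check relies on the fact that $Z(u) = e^{u}$ satisfies the equation when all the boundary values are unity, so that φ = 1 identically.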

3. The mean growth of the process. It is possible to deduce the differential equations
satisfied by the expected values of the $\{n_j\}$ from the fundamental equation (14), but it is
easier to derive them independently; in fact, simple considerations of continuity give at once

$$\frac{d\nu_1}{d\theta} = 2\nu_k - \nu_1, \quad\text{and}\quad \frac{d\nu_j}{d\theta} = \nu_{j-1} - \nu_j \quad (1 < j \le k), \tag{17}$$

where $\nu_j$ is the expected value of $n_j$. If now one writes $\rho_j$ for $\nu_je^{\theta}$ these equations can readily
be seen to be equivalent to

$$\rho_j = \left(\frac{d}{d\theta}\right)^{k-j}\rho, \quad\text{where}\quad \left(\frac{d}{d\theta}\right)^{k}\rho = 2\rho.$$

Thus if ω is the primitive kth root of unity, exp (2πi/k), then

$$\rho = \sum_{r=0}^{k-1}A_r\exp(2^{1/k}\omega^r\theta)$$

(where the quantities $A_r$ are as yet undetermined constants), and

$$\rho_j = 2^{1-j/k}\sum_{r=0}^{k-1}A_r\,\omega^{-jr}\exp(2^{1/k}\omega^r\theta).$$

Now initially, when both t and θ are zero, $\rho_1 = 1$ and $\rho_j = 0$ (1 < j ≤ k); the $\{A_r\}$ must therefore
have the values

$$A_r = \frac{1}{2k}\,2^{1/k}\omega^r.$$

Accordingly, the expected number of individuals in the jth phase at time t is

$$E(n_j) = \frac{1}{k}\,2^{-(j-1)/k}\sum_{r=0}^{k-1}\omega^{-(j-1)r}\exp\{(2^{1/k}\omega^r - 1)\,k\lambda t\}, \tag{18}$$

while the expected total population size at the same time is

$$E(n) = \frac{1}{2k}\sum_{r=0}^{k-1}\frac{\exp\{(2^{1/k}\omega^r - 1)\,k\lambda t\}}{1 - 2^{-1/k}\omega^{-r}} \tag{19}$$

$$\qquad = e^{-k\lambda t}\sum_{m=0}^{\infty}2^m\left\{\frac{(k\lambda t)^{mk}}{(mk)!} + \frac{(k\lambda t)^{mk+1}}{(mk+1)!} + \ldots + \frac{(k\lambda t)^{mk+k-1}}{(mk+k-1)!}\right\}. \tag{20}$$

In the last formula, the jth term in each bracket is the contribution from the jth phase.

These formulae possess two limiting forms of practical interest. The first describes (for
fixed k) the behaviour of the mean population size for large values of the time t. The other
limiting form is obtained by fixing t and allowing k to approach infinity (so that the process
approximates to the deterministic model for which the generation time has the fixed value
1/λ). Care is required in interpreting the results because the two limiting processes do not
in general commute.

For large values of t (k remaining fixed) the dominant term in (18) is always the first.
Indeed, since

$$2^{1/k}\cos\frac{2\pi}{k} - 1 < 0 \quad\text{when}\quad 2 \le k \le 28,$$

the remaining terms represent damped oscillations for these values of k, while for larger
values of k they do not themselves tend to zero but are still of a lower order than the first
term. Since

$$\lim_{k\to\infty}k(2^{1/k} - 1) = \log 2,$$

it is convenient to write

$$a_k = k(2^{1/k} - 1) \quad (\to \log 2 = 0.693, \text{ as } k \to \infty).$$

The expected population sizes for large t are then given asymptotically by

$$E(n_j) \sim \frac{2^{-(j-1)/k}}{k}\,e^{a_k\lambda t}, \tag{21}$$

and

$$E(n) \sim \frac{1}{2k(1 - 2^{-1/k})}\,e^{a_k\lambda t}. \tag{22}$$

It is rather curious that the coefficient of $e^{a_k\lambda t}$ in (22) has the limit 0.721 as k approaches
infinity, and not unity as one might have expected. This is an example of the non-commuting
character of the two limiting processes.

The dependence of $a_k$ upon k is shown by the following short table. As far as the ultimate
rate of growth of the mean population size is concerned, the variability of the generation
time evidently ceases to have much effect after k exceeds a value of about 35 (this is the value
of k for which $a_k$ differs from log 2 by 1 %):

  k   = 1      2      3      4      5      10     15     20     25     30     ∞
  a_k = 1.000  0.828  0.780  0.757  0.744  0.718  0.709  0.705  0.703  0.701  0.693

It is more difficult to discuss the behaviour of E(n) when k tends to infinity and t remains
fixed. From (20) it follows that E(n) is equal to the expected value of $2^{[N/k]}$, when N is a
Poisson variable of mean value kλt, and [x] is used to denote the integer part of x. It can
be shown from this that

$$\lim_{k\to\infty}E(n) = 2^{[\lambda t]} \tag{23}$$

when λt is not an integer; while

$$\lim_{k\to\infty}E(n) = \tfrac34\cdot 2^{\lambda t} \tag{23a}$$

when λt is an integer. The limit of E(n) as k approaches infinity is thus a discontinuous
function of the time t, whose value at a point of discontinuity is equal to the arithmetic mean
of the left- and right-hand limits. The details of this calculation, though elementary, are
a little tiresome, and are perhaps not worth reproducing here. Formula (23) of course
expresses the fact that in the limit as k tends to infinity the multiple-phase model coincides
with the deterministic model already mentioned.

It will be noticed from (21) that for fixed k and large values of t, the expected numbers of
individuals in each of the several phases will be proportional to

$$1,\; 2^{-1/k},\; 2^{-2/k},\; \ldots,\; 2^{-(k-1)/k}. \tag{24}$$

When k = 2, it is possible to express the formulae for the expected population sizes in
closed form:

$$E(n_1) = e^{-2\lambda t}\cosh(2\sqrt2\,\lambda t) \quad\text{and}\quad E(n_2) = \frac{1}{\sqrt2}\,e^{-2\lambda t}\sinh(2\sqrt2\,\lambda t). \tag{25}$$
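The behaviour of the mean for large k, and the closed form for k = 2, can both be illustrated from the Poisson representation of E(n). The sketch below is illustrative only; the function name `mean_size`, the choice k = 2000 as a stand-in for "large k", and the truncation of the Poisson sum are assumptions.

```python
import math

def mean_size(k, lam_t):
    """E(n) as the expected value of 2^[N/k], N Poisson with mean k*lam_t.
    Terms are formed in logarithms so that large means cause no overflow."""
    theta = k * lam_t
    n_max = int(2 * theta + 40 * math.sqrt(theta) + 60)
    log2, log_theta = math.log(2.0), math.log(theta)
    return sum(math.exp((N // k) * log2 + N * log_theta - theta - math.lgamma(N + 1))
               for N in range(n_max))

# Large k, lam*t not an integer: the limit is 2^[lam*t].
print(mean_size(2000, 2.5))      # close to 4

# Large k, lam*t an integer: the limit is (3/4) * 2^(lam*t).
print(mean_size(2000, 2.0))      # close to 3

# k = 2: compare with the closed form E(n_1) + E(n_2).
lam_t = 0.7
x = 2.0 * math.sqrt(2.0) * lam_t
closed = math.exp(-2.0 * lam_t) * (math.cosh(x) + math.sinh(x) / math.sqrt(2.0))
print(mean_size(2, lam_t), closed)
```

The approach to the limit at integer values of λt is slow (the discrepancy is of order $k^{-1/2}$), which is why a fairly large k is used in the second check.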


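4. The differential equation for the cumulant-generating function. The
cumulant-generating function for the total population size is

$$\log E(e^{n\alpha}) \equiv K(\alpha, \theta) = \log\phi(e^{\alpha}, e^{\alpha}, \ldots, e^{\alpha};\,t) = -\theta + \log Z(\theta),$$

where the function Z(θ) is to be determined from the differential equation

$$\left(\frac{d}{d\theta}\right)^{k}Z(\theta) = e^{-\theta}\{Z(\theta)\}^2,$$

with the boundary conditions $Z^{(i)}(0) = e^{\alpha}$ (0 ≤ i ≤ k − 1).

If in these equations one writes $Z = e^{\theta+K}$, the boundary conditions become

$$\left(1 + \frac{\partial}{\partial\theta}\right)^{i}e^{K(\alpha,\theta)} = e^{\alpha}, \text{ when } \theta = 0 \quad (0 \le i \le k-1),$$

or

$$e^{K(\alpha,0)} = e^{\alpha} \quad\text{and}\quad \left(\frac{\partial}{\partial\theta}\right)^{i}e^{K(\alpha,\theta)} = 0, \text{ when } \theta = 0 \quad (1 \le i \le k-1),$$

from which it is easily seen that

$$K(\alpha, 0) = \alpha \quad\text{and}\quad \left(\frac{\partial}{\partial\theta}\right)^{i}K(\alpha,\theta) = 0, \text{ when } \theta = 0 \quad (1 \le i \le k-1), \tag{26}$$

while the differential equation for K(α, θ) which is to be associated with these boundary
conditions is

$$\left(1 + \frac{\partial}{\partial\theta}\right)^{k}e^{K} = e^{2K}. \tag{27}$$

For some purposes it is convenient to write θ = kT, so that T = λt, and then

$$\left(1 + \frac{1}{k}\frac{\partial}{\partial T}\right)^{k}e^{K} = e^{2K}, \tag{28}$$

while

$$K = \alpha \quad\text{and}\quad \left(\frac{\partial}{\partial T}\right)^{i}K = 0, \text{ when } T = 0 \quad (1 \le i \le k-1). \tag{29}$$

The two most important special cases are, of course, (i) k = 1, and (ii) k → ∞. In case (i)
let $e^{K} \equiv M$; then the differential equation becomes

$$\frac{dM}{dT} = M(M - 1), \quad\text{with}\quad M(0) = e^{\alpha},$$

which is easily found to have the solution

$$M = \frac{1}{1 - e^{T}(1 - e^{-\alpha})},$$

in agreement with the results of Feller mentioned at the end of § 2. On the other hand, in
case (ii), as k → ∞ the operator

$$\left(1 + \frac{1}{k}\frac{\partial}{\partial T}\right)^{k}$$

formally approaches the limit $\exp(\partial/\partial T)$, and so the equation (28) takes the form

$$K(\alpha, T + 1) = 2K(\alpha, T)$$

with the initial conditions

$$K(\alpha, 0) = \alpha \quad\text{and}\quad \left(\frac{\partial}{\partial T}\right)^{i}K(\alpha, 0) = 0 \quad (\text{all } i \ge 1).$$

The solution is $K(\alpha, T) = \alpha\,2^{[T]}$,

in agreement with the results for the deterministic model.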
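For the case k = 1 the first two cumulants can be recovered from the cumulant-generating function by numerical differentiation, which gives a quick check on the solution for M. The sketch below assumes the solution in the form $M = 1/\{1 - e^{T}(1 - e^{-\alpha})\}$ and uses central differences at α = 0; the step length h is an arbitrary choice, and the function names are hypothetical.

```python
import math

def K(alpha, T):
    """Cumulant-generating function for k = 1: K = log M,
    with M = 1 / (1 - e^T (1 - e^(-alpha)))."""
    return -math.log(1.0 - math.exp(T) * (1.0 - math.exp(-alpha)))

def first_two_cumulants(T, h=1e-3):
    """Mean and variance of n from central differences of K at alpha = 0."""
    kappa1 = (K(h, T) - K(-h, T)) / (2.0 * h)
    kappa2 = (K(h, T) - 2.0 * K(0.0, T) + K(-h, T)) / h ** 2
    return kappa1, kappa2

T = 1.2
kappa1, kappa2 = first_two_cumulants(T)
print(kappa1, math.exp(T))                        # mean: e^T
print(kappa2, math.exp(T) * (math.exp(T) - 1.0))  # variance: e^T (e^T - 1)
```

The step h must be small enough that $e^{T}(1 - e^{-h}) < 1$, since the moment-generating function exists only for sufficiently small positive α.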

5. The variance of n. From the equation satisfied by the cumulant-generating function
one can readily obtain an equation to determine the variance of the population size n. Since

$$e^{K} = 1 + \alpha\kappa_1 + \tfrac12\alpha^2(\kappa_2 + \kappa_1^2) + \ldots,$$

where the cumulants $\kappa_1$, $\kappa_2$, etc., are functions of θ, it follows that

$$\left(1 + \frac{d}{d\theta}\right)^{k}\kappa_1 = 2\kappa_1, \tag{30}$$

and

$$\left(1 + \frac{d}{d\theta}\right)^{k}(\kappa_2 + \kappa_1^2) = 2(\kappa_2 + \kappa_1^2) + 2\kappa_1^2. \tag{31}$$

The first of these equations provides an alternative starting-point for the investigation of
the mean value of n, given in § 3, while the second is the required formula determining the
variance. It is to be combined with the boundary conditions

$$\left(\frac{d}{d\theta}\right)^{i}\kappa_2(0) = 0 \quad (0 \le i \le k-1).$$

In principle the equation (31) could be solved explicitly, but fortunately there is little to be
gained in carrying out this laborious task. It is enough to notice, first, that when the time
t is large the dominating term in the complementary function will be a constant multiple of

$$e^{(2^{1/k}-1)\theta}, \tag{32}$$

while the dominating term in the particular integral is a constant multiple of

$$e^{2(2^{1/k}-1)\theta}. \tag{33}$$

Thus, since (32) is of a smaller order than (33) when θ is large, it follows that

$$\text{Variance}\,(n) \sim \left\{\frac{4 - (2^{1+1/k}-1)^k}{(2^{1+1/k}-1)^k - 2}\right\}\{E(n)\}^2 = C_k\{E(n)\}^2, \tag{34}$$

as the time t tends to infinity. The values of $C_k$ for the first few values of k are given below:

  k   = 1      2      3      4      5
  C_k = 1.000  0.489  0.324  0.242  0.193

If k is allowed to approach infinity,

$$C_k \sim \frac{2(\log 2)^2}{k} = \frac{0.9609}{k};$$

the table shows that this asymptotic formula is true with surprising accuracy even for quite
small values of k (the error when k = 1 being only 4 %). Accordingly the relation

$$\text{C. of V.}\,(n) \sim \sqrt{\frac{2}{k}}\,\log 2, \tag{35}$$

for the coefficient of variation of the population size can be used for all values of k, provided
that n is sufficiently large. In the following section an extension of (35) will be found, which
can be used whenever one of k and n is large enough. Some values of (35) are tabulated below
(although the first entry has been based on the exact value $C_1 = 1$).

  C. of V. (n), for large n
  k            = 1      5      20     100
  C. of V. (n) = 100%   44%    22%    10%
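The tabulated values of $C_k$ and of the coefficient of variation can be reproduced numerically. The closed form used for $C_k$ in the sketch below is an assumption of the sketch: it is what results when the dominant exponential term of the mean is substituted into the equation for $\kappa_2 + \kappa_1^2$, and it is offered only because it reproduces the tabulated figures. The function names are hypothetical.

```python
import math

def C(k):
    """Ratio Variance(n) / E(n)^2 for large t (assumed closed form)."""
    b = (2.0 ** (1.0 + 1.0 / k) - 1.0) ** k
    return (4.0 - b) / (b - 2.0)

def C_asymptotic(k):
    """Large-k form 2 (log 2)^2 / k."""
    return 2.0 * math.log(2.0) ** 2 / k

def cv_percent(k):
    """Coefficient of variation of n for large n, in per cent."""
    return 100.0 * math.sqrt(2.0 / k) * math.log(2.0)

for k in (1, 2, 3, 4, 5):
    print(k, round(C(k), 3), round(C_asymptotic(k), 3))
for k in (5, 20, 100):
    print(k, round(cv_percent(k)))
```

Comparison of the two columns printed for k = 1 to 5 shows how closely the asymptotic form follows the exact ratio even for small k.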

6. The asymptotic form for the distribution of the population size when the
parameter k tends to infinity. Once again it is convenient to write θ = kT and $e^{K} = M$,
so that T = λt; M(α, T) is then the moment-generating function for the population size n,
and satisfies the differential equation

$$\left(1 + \frac{1}{k}\frac{\partial}{\partial T}\right)^{k}M = M^2, \tag{36}$$

with the boundary conditions

$$M = e^{\alpha} \quad\text{and}\quad \left(\frac{\partial}{\partial T}\right)^{i}M = 0 \quad (1 \le i \le k-1), \text{ when } T = 0.$$

It was seen in the last section that for large T and k the coefficient of variation of n is of
order $k^{-1/2}$; this suggests that it may be appropriate to consider the standardized variable

$$X \equiv k^{1/2}(n - 2^{T}), \tag{37}$$

which one might expect to have a non-trivial limiting distribution as k tends to infinity, for
any fixed value of t. The variable X has the moment-generating function

$$E(e^{\beta X}) \equiv M_0(\beta, T) = \exp(-k^{1/2}\beta\,2^{T})\,M,$$

where M is the solution of (36) associated with the boundary conditions

$$M = \exp(\beta k^{1/2}) \quad\text{and}\quad \left(\frac{\partial}{\partial T}\right)^{i}M = 0 \quad (1 \le i \le k-1), \text{ when } T = 0.$$

The problem is, therefore, to substitute

$$\exp(k^{1/2}\beta\,2^{T})\,M_0(\beta, T)$$

for M in (36), and then to find the asymptotic form (as k tends to infinity) of the solution
satisfying the boundary conditions

$$M_0 = 1 \quad\text{and}\quad \left(\frac{\partial}{\partial T}\right)^{i}M_0 = \exp(\beta k^{1/2})\left(\frac{\partial}{\partial T}\right)^{i}\exp(-k^{1/2}\beta\,2^{T}) \quad (1 \le i \le k-1), \text{ when } T = 0. \tag{38}$$

The formal solution to this problem will now be given. A rigorous treatment would be
preferable, but one does not immediately suggest itself; it therefore seems worth while
pursuing the investigation by heuristic methods.



David G. Kendall 


327 


In the first place, it is convenient to write 

( x \k co x m 

l+j) (3») 

and then to write the fundamental equation (36) in the form 

or, on expansion, jE Q (4^) M ( T ) = i M ( T - 1 )}*- ( 40 ) 

The first few coefficients f m (k) are 

f 0 (k) = 1 Ik, Mk) = 0, 

/«<*)--*, /#) = +i 

/ 4 (&) — + ff,(k) = +\ — %k, 


and on considering the mode of formation of the general term of the expansion* it will be 
found that when m = 2p + 1 the largest term in f m (k) is of order while when m — 2p 
the largest term is ( — ) p 


2*>p! 


k»~K 


(41) 


On the other hand, for all values of m the largest term in 


(A)”‘{exp(^2^)3f 0 (/?,T)}, 

as k tends to infinity (supposing M 0 and its derivatives to remain of finite order in these 
circumstances), is 2 T log 2) m exp (k*/J2 T ) T). 

Thus if lira M^i/3, T) = M^/J, T), 

k-* oo 


it will follow that = p ?0 ~pf {miQg 2) * 2iT}V 

= exp { - ^(log 2) 2 2 2T }, (42) 

and the most general solution of this functional equation (when T is an integer) is easily 
seen to be M 1 (/?, T) — exp {/?*(log 2)* 2 2T + A 2 T ), 

where the constant A is to be determined from the initial conditions. Now 


0) = if 0 (/?,0)= 1, 

and so when T is an integer, 

W, T) - exp {£/?* 2(log 2) 2 2 T (2 r — 1)}. (43) 

But this is the limiting form of the moment-generating function for the standardized random 
variable (37). Accordingly the above argument indicates that as k tends to infinity the
population size n is asymptotically normally distributed with the mean value

    $\bar{n} = 2^{\lambda t}$,    (44)

and the variance

    $\frac{2(\log 2)^2}{k}\, \bar{n}(\bar{n} - 1)$,    (45)

for every fixed integer value of $\lambda t$.


* It is advisable to calculate the f m (k) from the successive expansion of exp {log A(x)}. 





328 


A stochastic birth process 

The rather unexpected agreement with formula (35) of § 5 will be noticed. It appears, in 
fact, that the approximate formula, 

    $\text{C. of V.}(n) \sim \log 2\, \sqrt{2/k}\, (1 - 1/\bar{n})^{1/2}$,    (46)

is valid whenever one of k and n is sufficiently large. The limiting values of (46), for large n , 
are illustrated by the second table in § 6. 

The purely formal character of the preceding argument is clearly responsible for the 
difference between (44) and the earlier, precise, result (23a). It is to be expected that the 
result (45) may be similarly incomplete, and a more careful analysis would be of interest. 

7. The coefficient of variation of the population size for more general types
of stochastic birth process. It has been shown in § 5 that for the multiple-phase birth
process

    $\text{C. of V.}(n) \sim A_k / k^{1/2}$, when n is large,

where the coefficient $A_k$ is equal to unity when k = 1 and has the limit 0.98 as k approaches
infinity. On the other hand, it easily follows from the definition of the process that

    $\text{C. of V.}(\tau) = 1 / k^{1/2}$

(where $\tau$, as usual, denotes the generation time). The similarity of the two results is striking,
and their relationship will be made clearer by the following crude argument, a precise
formulation of which is yet to be found. This gives reasons for supposing that the relation,

    $\text{C. of V.}(n) \sim 0.98\ \text{C. of V.}(\tau)$, when n is large,    (47)

is true for all stochastic birth processes of the general type discussed in § 1, whatever the form
of the distribution of generation time, provided that C. of V.($\tau$) is small enough for the process
to be equivalent, in regard to its mean growth, to the deterministic process for which

    $n = 2^{\lambda t}$,

where $1/\lambda$ is the mean generation time for the actual process. To cast the subsequent argument
into a rigorous form it would be necessary to pay careful attention to the ‘spreading’ of
generations; this will not be attempted here, and the conclusions should be regarded as
tentative only.

Suppose then that t (and so also n) is so large as to make it ‘practically certain’ that the
first g generations have been completely established.*

Let    $\tau_{ij}$    (i = 1, 2, ..., g - 1; j = 1, 2, ..., $2^{i-1}$)

be the time which elapses between the ‘birth’ of an individual (to be identified by the suffix j)
in the ith generation and its own later subdivision. Thus, for example, $\tau_{11}$ is the epoch at
which the initial individual subdivides, to be replaced by the two individuals of the second
generation which themselves subdivide at the epochs $\tau_{11} + \tau_{21}$ and $\tau_{11} + \tau_{22}$ respectively.
When the gth generation has been established its $2^{g-1}$ members will continue to generate
$2^{g-1}$ independent subpopulations during the intervals of time severally left to them until
the final count is taken at the epoch t. If any one member of the gth generation is ‘born’
at the epoch u, it will have a time (t - u) available for continued subdivision, and the size of
the subpopulation which it generates will be (in the deterministic model)

    $2^{\lambda(t-u)}$.

* One (artificial) way of making this statement precise would be to impose an upper bound to the
possible values of the generation time. This, however, would exclude the multiple-phase process.





It is a principle of the present calculation that once the first-order fluctuations have been
identified their coefficients can be calculated on the basis of the deterministic model. A
fluctuation $\delta u$ in u thus implies a fluctuation

    $-\lambda\, \delta u\, 2^{\lambda(t-u)} \log 2$

in the expected size of the subpopulation developed at the epoch t, and on inserting the
‘deterministic’ value for u in the coefficient, this becomes

    $-\lambda\, \delta u\, 2^{\lambda t - g + 1} \log 2$.

Now $\delta u$ is the sum of the fluctuations in the generation times for each of the specified
individual’s (g - 1) ancestors. If for the moment the process is imagined to develop strictly
according to expectation once the gth generation has been established, it will be seen that
the contribution of the fluctuation $\delta\tau_{ij}$ to the final population size n will be

    $-2^{g-1}\, \lambda\, \delta\tau_{ij}\, 2^{\lambda t - g + 1} \log 2$,

multiplied by the fraction of individuals in the gth generation who possess (i, j) as an
ancestor. This fraction is $1/2^{i-1}$, and so

    $\partial n / \partial \tau_{ij} = -\lambda\, 2^{\lambda t - i + 1} \log 2$.    (48)

The total contribution to Var(n) from the fluctuations possible during the development of
the first g generations is thus

    $\lambda^2 (\log 2)^2\, \mathrm{Var}(\tau)\, \{1\,(2^{\lambda t})^2 + 2\,(2^{\lambda t - 1})^2 + 2^2 (2^{\lambda t - 2})^2 + \dots + 2^{g-2} (2^{\lambda t - g + 2})^2\}$

        $= 2\lambda^2 (\log 2)^2\, \mathrm{Var}(\tau)\, 2^{2\lambda t} \{1 - 1/2^{g-1}\}$.

To this must be added the further contribution arising from fluctuations in the development
of the (g + 1)th and later generations. Since this is equivalent to the growth of $2^{g-1}$
independent populations during a time $t - (g-1)/\lambda$, the additional contribution to the
variance of n will be

    $2^{g-1} \cdot 2\lambda^2 (\log 2)^2\, \mathrm{Var}(\tau)\, 2^{2\lambda t - 2g + 2}$,

to the first order (adopting the usual iterative procedure). If now n is large enough to justify
the use of a sufficiently large g, it will follow that the second contribution is negligible and that

    $\text{C. of V.}(n) \sim 2^{1/2} \log 2\ \text{C. of V.}(\tau)$,

as required. Since $2^{1/2} \log 2 = 0.98$, it is therefore suggested as a practical rule-of-thumb that
the coefficients of variation for the population size and the generation time are approximately equal.
Of course, if the initial population size N is greater than unity this estimate of C. of V.(n)
must be divided by $\sqrt{N}$, since each of the N initial members will generate independently
a subpopulation to which the result will apply.
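The summation in the argument above can be checked numerically. The following is only a sketch under stated assumptions: the function names and the illustrative values of the rate, the epoch t, the number of generations g and the coefficient of variation of the generation time are hypothetical and do not come from the paper. It adds up the first-order contributions of generations 1 to g - 1 term by term, compares the total with the closed form, and forms the ratio of the two coefficients of variation.

```python
import math

def variance_first_generations(lam, var_tau, t, g):
    """Sum the first-order contributions of generations 1..g-1 to Var(n)."""
    total = 0.0
    for i in range(1, g):
        members = 2 ** (i - 1)  # number of individuals in generation i
        # magnitude of dn/d(tau_ij) in the deterministic model, as in (48)
        sensitivity = lam * 2 ** (lam * t - i + 1) * math.log(2)
        total += members * sensitivity ** 2 * var_tau
    return total

def variance_closed_form(lam, var_tau, t, g):
    """Closed form of the same sum: 2 lam^2 (log 2)^2 Var(tau) 2^(2 lam t) {1 - 1/2^(g-1)}."""
    return (2 * lam ** 2 * math.log(2) ** 2 * var_tau
            * 2 ** (2 * lam * t) * (1 - 2.0 ** (-(g - 1))))

# Illustrative (hypothetical) values only.
lam, t, g = 1.0, 40.0, 30
cv_tau = 0.2                     # coefficient of variation of the generation time
var_tau = (cv_tau / lam) ** 2    # mean generation time is 1/lam

explicit = variance_first_generations(lam, var_tau, t, g)
closed = variance_closed_form(lam, var_tau, t, g)
cv_n = math.sqrt(explicit) / 2 ** (lam * t)   # deterministic mean size is 2^(lam t)
ratio = cv_n / cv_tau

print(explicit, closed, ratio)
```

For large g the ratio should approach the factor $2^{1/2} \log 2$ quoted in the text; the sketch ignores the second contribution, which the argument treats as negligible.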


Summary 

This paper presents a mathematical account of a stochastic birth process in which the 
generation time (the interval between ‘birth’ and ‘parenthood’ in the life of an individual) 
is distributed like a $\chi^2$-variate with 2k degrees of freedom. When k = 1, the process reduces
to the simple stochastic birth process in the form originally introduced by W. Feller (1939),
while as k tends to infinity the process assumes a strictly deterministic form in which the 




population undergoes an exact doubling at regular intervals. It is suggested that for
intermediate values of k (say of the order of 20) this new ‘multiple-phase’ stochastic birth process
represents a further step towards the construction of an adequate mathematical model of 
the growth of real populations of elementary organisms. 

In order to show that the methods of the paper are relevant in an actual biological example, 
a brief discussion is given of the work of Kelly & Rahn (1932) on the distribution of generation
times for Bacterium aerogenes.

The paper concludes with a sketch of a more general argument suggesting that for a wide 
class of stochastic birth processes the coefficient of variation of the population size is ulti- 
mately approximately equal to the coefficient of variation of the generation time, when this 
is sufficiently small for the process to approximate to the deterministic form, the population 
being assumed to have developed from a single ‘ancestor’ in the absence of ‘mortality’.

In conclusion, I should like to express my thanks to Mr D. J. Finney for kindly making 
available to me the computing facilities at his disposal, during the preparation of this paper. 

REFERENCES 

Arley, N. & Borchsenius, V. (1945). On the theory of infinite systems of differential equations and
their application to the theory of stochastic processes and the perturbation theory of quantum
mechanics. Acta Math. 76, 261-322, esp. pp. 298-9.

Bartlett, M. S. (1947). Stochastic Processes. (Notes of a course given at the University of North
Carolina in the Fall Quarter, 1946.) It is understood that copies of these notes are available on
request.

Bartlett, M. S. & Kendall, D. G. (1946). The statistical analysis of variance-heterogeneity, and the
logarithmic transformation. J. R. Statist. Soc. Suppl. 8, 128-38.

Feller, W. (1939). Die Grundlagen der Volterraschen Theorie des Kampfes ums Dasein in
wahrscheinlichkeitstheoretischer Behandlung. Acta Biotheoretica, 5, 11-40.

Kelly, C. D. & Rahn, O. (1932). The growth rate of individual bacterial cells. J. Bacteriol. 23, 147-53.

Kendall, D. G. (1947). A review of some recent work on discontinuous Markoff processes with
applications to biology, physics, and actuarial science. J. R. Statist. Soc. 110, 130-7.

Kendall, D. G. (1948a). On some modes of population growth leading to R. A. Fisher’s logarithmic
series distribution. Biometrika, 35, 6-15.

Kendall, D. G. (1948b). On the generalized ‘birth-and-death’ process. Ann. Math. Statist. 19, 1-15.





2x2 TABLES; THE POWER FUNCTION OF THE TEST ON 
A RANDOMIZED EXPERIMENT 

By E. S. PEARSON and MAXINE MERRINGTON 

I. Introductory 

In his discussion of significance tests for 2 x 2 tables, Barnard (1947) has pointed out how 
data classified in the form of Table 1 may appear as the outcome of a number of different 
types of investigation. Differences in point of view which have been expressed regarding 
the handling of the figures, concern the probability constructs by aid of which the bare 
numerical data recorded in the table provide a basis for inference. Two lines of approach 
may be distinguished. 

Table 1

              Col. 1    Col. 2    Total
    Row 1       a         c         m
    Row 2       b         d         n
    Total       r         s         N


Following the first, it is considered that for all the types of problem,* the relevant 
information on the points at issue may be obtained by comparing the observed pattern 
of cell contents (a, b,c,d) with the set of possible patterns, all giving the marginal totals 
actually found in the sample. Thus there is only one degree of freedom among the four cell 
frequencies, and the relevant probability distribution is obtained from the hypergeometric 
series. This approach can be derived from Fisher’s information theory. But without using 
this theory, a may be referred to the one-dimensioned, conditional set as a convenient 
practical device, which avoids the introduction of nuisance parameters. 

From the point of view of the second approach it is an over-simplification to treat every 
case providing data in the form of Table 1 as a problem of sampling with fixed marginal 
totals. It is suggested that the readiness of the mind to assimilate the information provided 
by the statistical analysis depends on the directness of the relation between the theoretical 
probability set and the random process of selection introduced in collecting the data. Since 
the random procedure may have entered in different ways, the appropriate probability 
constructs may be expected to differ. 

It can be argued that, except in the case of small samples, the difference of approach is 
practically unimportant, and that even here, until tables of the kind which Barnard has in 
mind are available, the statistician will be forced to draw his conclusions from a table of the 
conditional distribution, such as that recently prepared by Finney (1948). But there is 
another aspect to the matter. The published discussion has hitherto been concerned primarily 
with the sampling distribution of a statistic under the null hypothesis. If we go beyond this 

* Three types were discussed by Barnard (1947) and Pearson (1947). 






and consider the sensitivity of the test, that is to say, its power to detect differences if they 
exist, it is at once clear that all problems cannot be treated in the same manner. 

In a recent paper, Patnaik (1948) has considered this aspect of what one of us (Pearson,
1947) has termed Problem II and of what Barnard termed the 2 x 2 comparative trial. This
occurs when we inquire whether the probability of an individual bearing a given character
A, is the same in two large populations from which random samples of size m and n,
respectively, have been drawn. Here, two separate random selections are involved, and the cell
contents may be described as having two degrees of freedom. If the two population
probabilities are unequal, i.e. $p_1(A) \ne p_2(A)$, the chance that the test will establish a difference
at a given significance level $\alpha$ is a function of $p_1$ and $p_2$, of m and n and of $\alpha$. This relationship
was explored by Patnaik.

In the present paper we shall consider what Pearson termed Problem I and Barnard the 
2x2 independence trial, from this aspect of the power of the test. In this case, only a single 
process of random selection or partition is called for. 

2. Statement of the Problem 

For convenience we shall describe the type of experiment we have in mind as one in which
two ‘treatments’, say A and B, are compared; the response is a quantal one, so that an 
individual either ‘reacts’ or ‘fails to react’. The applications suggested by these terms lie 
in the biological field, but there is no difficulty in translating the terms of the theoretical 
picture to fit a case where, for example, the individual is a shell, the two treatments are two 
types of fuze and the reaction is successful perforation of a steel plate. 

In this Problem I, the N individuals available for experiment are divided by a random 
partition into a group of m which receive treatment A, and a group of n = N - m which receive
treatment B. It is then observed that a/m and b/n react in the specified manner. The experi-
ment is self-contained and the random process under complete control; but without further 
assumptions or knowledge, the inferences that are possible relate only to the reactions of 
the N individuals to the treatments. This may be all that is called for. Inferences of wider 
application may be drawn by assuming that the N have been sampled randomly from a 
population in which we are interested. Or, as is often the case, the experiment may 
be one of a related series, each experiment in which is self-contained. These, taken 
as a whole, can form the basis of reasoned conclusions regarding the treatments, 
conclusions which are not dependent on all groups of individuals having been drawn 
randomly from a unique population in the rigorous statistical sense. This is the case if the 
tests are applied to laboratory animals whose susceptibility may change somewhat from
time to time. Or, as Barnard has suggested, when an open-air gunnery trial runs over several 
days of inconstant weather. 

The question then is this: confining attention to the group of N individuals, in what sense 
is it possible to interpret a difference in treatments and how can we measure the power of 
the test to detect such a difference? Perhaps the most general method of regarding the 
problem is to suppose that the N individuals fall into four classes: 

(i) those who would react if given either treatment, X in number; 

(ii) those who would react only if given treatment A, W in number; 

(iii) those who would react only if given treatment B, Y in number; 

(iv) those who would react to neither treatment, Z in number. 





In this way we recognize that every individual does not respond in the same way to a
given treatment; that the success of a shell in perforating a plate will depend not only on 
the fuzing, but on other factors such as strength of shell-case, angle of yaw on striking, the 
position of the strike on the plate, etc. These latter factors are purposely randomly associated 
with the two types of fuze under trial. 

If m individuals are selected randomly and assigned treatment A and the remaining n 
assigned treatment B, the resulting partition will be that shown in Table 2a; as, however,
we can only observe whether an individual has reacted or not, the figures available for
analysis will be in the form of Table 2b.


Table 2a

                   React if     Only react if given     React to
                   given                                neither
                   A or B          A          B         treatment    Total
    Treatment A      x_1          w_1        y_1          z_1          m
    Treatment B      x_2          w_2        y_2          z_2          n
    Total            X            W          Y            Z            N


Table 2b

                   React                 Fail to react          Total
    Treatment A    x_1 + w_1 = a         y_1 + z_1 = c            m
    Treatment B    x_2 + y_2 = b         w_2 + z_2 = d            n
    Total          X + w_1 + y_2 = r     Z + w_2 + y_1 = s        N


The usual null hypothesis is that the treatments are identical as far as producing a reaction
on these N individuals is concerned, i.e. it is the hypothesis that W = Y = 0. The test of
significance of departure from hypothesis would be applied to the 2 x 2 Table 2b, and in its
exact form consists in referring $a = x_1 + w_1$ to the appropriate hypergeometric distribution,
with parameters r, N and m. Here, if the sample is not too large, Finney’s (1948) table is
applicable. More approximately, we can either regard

    $\left(a - \frac{rm}{N}\right) \Big/ \left\{\frac{mnrs}{N^2 (N-1)}\right\}^{1/2}$

as a unit normal deviate, or

    $\frac{(ad - bc)^2 N}{mnrs}$

as a $\chi^2$ with 1 degree of freedom, making a correction for continuity, if necessary.
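The three forms of the test just described can be sketched in a few lines. This is an illustration only, not the authors' computing procedure: the function name is hypothetical, the cell counts a = 3, b = 11, c = 12, d = 4 are used purely as an example, and no correction for continuity is applied.

```python
from math import comb, sqrt

def two_by_two_tests(a, b, c, d):
    """Return (exact lower tail, normal deviate, chi-square) for a 2 x 2 table.

    Layout as in Table 2b: row 1 is (a, c) with total m, row 2 is (b, d)
    with total n; the column totals are r = a + b and s = c + d.
    """
    m, n = a + c, b + d
    r, s = a + b, c + d
    N = m + n

    def hyper(k):
        # Hypergeometric probability of k reactors in group A, margins fixed.
        return comb(r, k) * comb(N - r, m - k) / comb(N, m)

    lowest = max(0, r - n)  # smallest value a can take with these margins
    exact_lower_tail = sum(hyper(k) for k in range(lowest, a + 1))
    z = (a - r * m / N) / sqrt(m * n * r * s / (N ** 2 * (N - 1)))
    chi2 = (a * d - b * c) ** 2 * N / (m * n * r * s)
    return exact_lower_tail, z, chi2

tail, z, chi2 = two_by_two_tests(a=3, b=11, c=12, d=4)
print(round(tail, 5), round(z, 3), round(chi2, 3))
```

The square of the normal deviate differs from the chi-square quantity only by the factor (N - 1)/N, so the two approximations agree closely unless N is small.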





It should be noted that if treatment A were regarded as more successful than B when
X + W > X + Y, then the null hypothesis to test would be that W - Y = 0. The expectation
of the difference a/m - b/n is still zero if $W = Y \ne 0$, but its sampling distribution under
random partition can no longer be determined from the marginal totals of Table 2b.
The experiment as planned is not, in fact, able to distinguish between the cases W = Y = 0
and W - Y = 0; to do so would involve applying both treatments to the same individuals,
which will usually be impossible in practice.


Table 3a

                   React if      React only    React to
                   given         if given      neither
                   A or B        B             treatment    Total
    Treatment A      x_1           y_1           z_1          m
    Treatment B      x_2           y_2           z_2          n
    Total            X             Y             Z            N


Table 3b

                   React             Fail to react        Total
    Treatment A    x_1 = a           y_1 + z_1 = c          m
    Treatment B    x_2 + y_2 = b     z_2 = d                n
    Total          X + y_2 = r       Z + y_1 = s            N


In the discussion which follows we shall suppose that W = 0. This may narrow the field
of application, but not too seriously. If A is an old treatment and B a new one which it is
hoped is an improvement,* we are assuming that B has at least the qualities of A. Thus if
B aims at the cure of a disease and A is a control corresponding to ‘no treatment’, we assume
that in no case will B prevent a recovery which would have taken place without any treatment
at all. With W = 0, we have the scheme of Tables 3a and 3b. In this case the null hypothesis
is that Y = 0; the alternative that interests us is that Y > 0, so that the statistical test
applied to Table 3b involves comparing the observed value of a with its lower significance
level, or b with its upper level. This critical limit for a may be written $a(\alpha, r, N, m)$, where
$\alpha$ is the chance of falling at or below the limit if Y = 0. It can be determined either precisely
from the hypergeometric or from the normal approximation. If the null hypothesis is not
true, $r = X + y_2$ will vary from one random partition to another. Thus the problem is to
determine the probability that, when $Y \ne 0$, $a \le a(\alpha, r, N, m)$, where both a and r are random
variables.


* For purposes of discussion, we take ‘reaction’ to be good.





A numerical illustration will help to make the position clear. Thirty small disks were 
placed in a box, of which X = 10 were coloured red, Y = 10 coloured green and Z = 10 
coloured yellow. The disks were divided randomly into two groups A and B each containing 
m = n = 15. In terms of treatment comparisons, reds ‘respond to treatment’ if they fall
into group A , reds and greens if they fall into group B. Four of many possible partitions are 
shown in the 2x3 tables of row I below; under these are given in row II the corresponding 
2x2 tables which contain the numerical data available to the experimenter. He cannot 
distinguish between greens and yellows in group A or between reds and greens in group B. 
The null hypothesis is that there were no green disks in the box of 30. Below the 2x2 tables 
are shown (III) the critical limits corresponding to a nominal 5 % significance level, (IV) the 
true level or sum of the tail terms and (V) the conclusion that would be drawn. It may be 
noted that this experimental partition was repeated 50 times and that significance was 
established, i.e. the presence of green disks inferred, in 20 of these partitions. This is a result 
which, as will be shown below, agrees closely with expectation. 



         Colour           R.  G.  Y.  Total    R.  G.  Y.  Total    R.  G.  Y.  Total    R.  G.  Y.  Total

    I    A                 3   6   6   15       5   5   5   15       5   4   6   15       7   3   5   15
         B                 7   4   4   15       5   5   5   15       5   6   4   15       3   7   5   15
         Total            10  10  10   30      10  10  10   30      10  10  10   30      10  10  10   30

    II   A                 3  12       15       5  10       15       5  10       15       7   8       15
         B                11   4       15      10   5       15      11   4       15      10   5       15
         Total            14  16       30      15  15       30      16  14       30      17  13       30

    III  Rejection level,
         given r          a_0 = 4              a_0 = 4              a_0 = 5              a_0 = 5

    IV   P{a ≤ a_0 | r}   0.0328               0.0134               0.0328               0.0127

    V    Significant      Yes                  No                   Yes                  No

The probability that $a \le a(\alpha, r, N, m)$ when $Y \ne 0$ represents the power of the test in the
sense of Neyman & Pearson, i.e. the probability of establishing significance at the $100\alpha$ %
level when Y > 0. It will be a function not only of Y but of X (or Z). Owing to discontinuity
in the distribution of a, it is, of course, impossible to choose $a(\alpha, r, N, m)$ with the same value
of $\alpha$ for all r. In our calculations given below we have used what may be termed nominal
5 and 15 % levels, such that the critical limits $a(\alpha, r, N, m)$ are the highest integer values
satisfying the inequalities

    $P\{a \le a(\alpha, r, N, m)\} \le \alpha$    ($\alpha$ = 0.05 and 0.15).    (1)
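The rule in (1) can be sketched as a short search over the hypergeometric tail. The function names below are hypothetical and the code is an illustration of the definition rather than the authors' method of computation; the return value -1 is a convention adopted here for the case in which no value of a can be declared significant.

```python
from math import comb

def lower_tail(a0, r, N, m):
    """P{a <= a0} under the null hypothesis, for margins r, N and m."""
    lowest = max(0, r - (N - m))  # smallest attainable value of a
    terms = (comb(r, k) * comb(N - r, m - k) for k in range(lowest, a0 + 1))
    return sum(terms) / comb(N, m)

def critical_limit(alpha, r, N, m):
    """Highest integer a0 with P{a <= a0} <= alpha (-1 if there is none)."""
    a0 = -1
    while a0 + 1 <= min(r, m) and lower_tail(a0 + 1, r, N, m) <= alpha:
        a0 += 1
    return a0

for r in (14, 15, 16, 17):
    a0 = critical_limit(0.05, r, N=30, m=15)
    print(r, a0, round(lower_tail(a0, r, 30, 15), 4))
```

For N = 30, m = 15 and r = 14, 15, 16, 17 this gives the limits 4, 4, 5, 5 and the tail sums 0.0328, 0.0134, 0.0328, 0.0127 quoted in the numerical illustration with the coloured disks.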

We have taken m = n = ½N, supposing that each treatment is given to the same number
of individuals. This will usually be the case in a planned experiment with randomization, 
and as our object is to illustrate certain points which we believe are of general interest, no 
exhaustive tabulation is called for. 

The method of calculation leading to Figs. 2-7 is described in the following section and 
a discussion of results is given in § 4. 





3. Method of Calculation 


In a 2 x 3 table of fixed margins, there are two degrees of freedom. We shall take as independent
variables $x_1$ and $y_1$. The probability of a particular partition of Table 3a is

    $p(x_1, y_1) = \frac{m!\, n!\, X!\, Y!\, Z!}{x_1!\, x_2!\, y_1!\, y_2!\, z_1!\, z_2!\, N!}$,    (2)

where $x_2$, $y_2$, $z_1$ and $z_2$ can all be expressed in terms of $x_1$, $y_1$ and the marginal totals. For
mathematical convenience we may regard the partition of Table 3a as obtained in two steps,
the first determining $y_1$, the second $x_1$. In fact, the expression (2) may be written as a product
of two parts

    $\frac{m!\, n!\, Y!\, (N-Y)!}{y_1!\, y_2!\, (m-y_1)!\, (n-y_2)!\, N!} \times \frac{X!\, Z!\, (x_1+z_1)!\, (x_2+z_2)!}{x_1!\, x_2!\, z_1!\, z_2!\, (X+Z)!}$.    (3)

The first factor, say $p(y_1)$, is the probability of obtaining $y_1$ individuals with a character
A in a sample of Y drawn randomly, without replacement, from a population of N
individuals of whom m have a character A. The second factor, say $p(x_1 \mid y_1)$, is the probability
of obtaining $x_1$ with A in a sample of X drawn, without replacement, from the remaining
N - Y individuals, of whom $m - y_1$ now have A. Thus

    $p(x_1, y_1) = p(y_1) \times p(x_1 \mid y_1)$,    (4)

and while $p(y_1)$ is the marginal distribution for $y_1$ of the joint distribution $p(x_1, y_1)$, $p(x_1 \mid y_1)$
is the relative probability distribution of $x_1$ in the array of constant $y_1$. Both $p(y_1)$ and
$p(x_1 \mid y_1)$ are of hypergeometric type; the joint distribution has been discussed by K. Pearson
(1924) who termed it a double hypergeometric series and considered how it might lead to a
frequency surface.* He gave some of the momental constants of $p(x_1 \mid y_1)$ which we shall use
below; in our present notation, and for the special case m = n = ½N, these become

    $\text{Mean}(x_1 \mid y_1) = Xp$,    (5)

    $\text{Variance}(x_1 \mid y_1) = Xpq\, \frac{Z}{X + Z - 1}$,    (6)

    $\beta_1(x_1 \mid y_1) = \frac{(p-q)^2}{Xpq}\, \frac{\{1 - 2(X-1)/(X+Z-2)\}^2}{1 - (X-1)/(X+Z-1)}$,    (7)

where    $p = 1 - q = (m - y_1)/(N - Y) = (x_1 + z_1)/(X + Z)$.    (8)

The distribution $p(x_1, y_1)$ is bounded by certain limiting lines, thus:

    $y_1 \le Y$,    $m \ge x_1 + y_1 \ge m - Z$.    (9)

Fig. 1 illustrates the position for the case

    X = 10, Y = 15, Z = 5, m = n = ½N = 15.    (10)

In applying the test of significance to the 2 x 2 Table 3b, we shall reject the null hypothesis
(Y = 0) when

    $x_1 = a \le a(\alpha, r, N, m)$.    (11)

But    $r = X + y_2 = X + Y - y_1$.    (12)

Hence for given X and Y there will be a critical limit in each $y_1$ array, such that if $x_1$ falls
at or below the limit, the hypothesis will be rejected. For the marginal values given in (10),


* He had illustrated the distribution many years before (K. Pearson, 1896) in connexion with the 
correlation between the number of cards of a given suit held by two players at whist. 





Fig. 1 shows these rejection limits as a connected line for the nominal level $\alpha$ = 0.05.* The
power of the test for a given X and Y is then the sum of the probabilities $p(x_1, y_1)$ taken over
the points of the field outside (or below) this critical boundary. Fig. 4 shows that for this
particular case the sum is about 0.85.
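The summation over the critical region can be sketched directly from the factorization (4): for each array $y_1$, find r from (12), find the critical limit from (1), and add $p(y_1)$ times the conditional probability that $x_1$ does not exceed that limit. The function names are hypothetical and this is an illustrative sketch, not a reconstruction of the authors' worksheets.

```python
from math import comb

def hyper_pmf(k, successes, population, draws):
    """Probability of k successes in `draws` draws without replacement."""
    if k < 0 or k > successes or draws - k < 0 or draws - k > population - successes:
        return 0.0
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def critical_limit(alpha, r, N, m):
    """Highest integer a0 with P{a <= a0} <= alpha under Y = 0 (-1 if none)."""
    a0, tail = -1, 0.0
    while a0 + 1 <= min(r, m):
        tail += hyper_pmf(a0 + 1, r, N, m)
        if tail > alpha:
            break
        a0 += 1
    return a0

def exact_power(X, Y, Z, m, alpha):
    """Sum of p(y1) * P{x1 <= a(alpha, r, N, m) | y1} over all arrays y1."""
    N = X + Y + Z
    power = 0.0
    for y1 in range(0, min(Y, m) + 1):
        p_y1 = hyper_pmf(y1, m, N, Y)          # first factor of (3)
        r = X + Y - y1                         # equation (12)
        a0 = critical_limit(alpha, r, N, m)
        # second factor of (3), summed over x1 = 0, ..., a0
        reject = sum(hyper_pmf(x1, m - y1, N - Y, X) for x1 in range(0, a0 + 1))
        power += p_y1 * reject
    return power

print(round(exact_power(10, 10, 10, 15, 0.05), 3))
print(round(exact_power(10, 15, 5, 15, 0.05), 3))
```

With X = Y = Z = 10 the sketch returns about 0.413, which agrees with the corresponding entry of Table 4 and with the 20 significant results in 50 partitions of the disk experiment. When Y = 0 the same function returns the true significance level, which cannot exceed the nominal $\alpha$.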



Fig. 1. Illustration for case of partition of 30 individuals into two groups, m = n = ½N = 15; X = 10,
Y = 15, Z = 5; nominal significance level, $\alpha$ = 0.05. The null hypothesis will be rejected at
the ($x_1$, $y_1$) points marked ●, where $x_1 = a$, $y_1 = 25 - r$.

For small samples the calculations are not very tedious if undertaken systematically, but
they become so when N is large. In the case N = 50 we have therefore used an approximation
based on assuming that the array distributions $p(x_1 \mid y_1)$ can be represented by normal curves
with the means and variances of equations (5) and (6). Since, using (8), when m = ½N,

    $q - p = (2y_1 - Y)/(N - Y)$,

where the expectation of $y_1$ is ½Y, it follows that the $\beta_1$ of the array distributions which
contain most of the frequency will be small provided that neither X nor Z is too small. The
accuracy of this normal approximation is illustrated below.

The calculations on which Figs. 2-7 have been based may be summarized as follows: 

(i) N = 20 and 30

Full calculations of the terms of the hypergeometric series

    $\frac{m!\, n!\, r!\, s!}{a!\, b!\, c!\, d!\, N!}$

were made for all values of r, and hence the levels corresponding to the nominal $\alpha$ = 0.05
and 0.15 were found. This determined the position of the boundary of the critical region as
illustrated in Fig. 1 and also provided the terms of the series $p(y_1)$. The expressions $p(x_1, y_1)$
corresponding to points in the critical region were then found and summed for selected values
of X, Y and Z, giving the ordinates of the power curves drawn in Figs. 2-5.

* These limits will be found to agree with those in Finney's (1948) table. The relevant section is
that for A = 15 = B (in his notation); further, since he takes a > b his (i) a and (ii) b correspond to our
(i) $25 - (x_1 + y_1)$ and (ii) $x_1$. See Appendix, p. 345 below. For example, points in Fig. 1 on the
diagonal $x_1 + y_1 = 12$ make Finney's a = 13 and for this value he gives the 5 % significance level as
b = 7; thus all points on this diagonal for which $x_1 \le 7$ fall in the rejection region.



[Figs. 2-7. Power curves. Ordinate: chance of establishing significance; abscissa: ratio Y/(Y + Z).]


(ii) Check for N = 30 

For a number of cases the normal approximation referred to above was also used. This
involved finding for each array the integral under the normal curve with mean and variance,
(5) and (6), beyond the point $a(\alpha, r, N, m) + 0.5$, using a correction for continuity. These
integrals were multiplied by the corresponding exact value of $p(y_1)$ and summed for all
arrays. A comparison of exact and approximate results is shown in Table 4. From our present
point of view the agreement may be regarded as good.
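The approximate method just described may be sketched as follows. Each array distribution is replaced by a normal curve with the mean and variance of (5) and (6), the area at or below the critical limit is taken with the 0.5 correction for continuity, and the areas are weighted by the exact $p(y_1)$. The function names are hypothetical, and the handling of a degenerate array (zero variance) is a convention adopted here, not something stated in the paper.

```python
from math import comb, erf, sqrt

def hyper_pmf(k, successes, population, draws):
    """Probability of k successes in `draws` draws without replacement."""
    if k < 0 or k > successes or draws - k < 0 or draws - k > population - successes:
        return 0.0
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

def critical_limit(alpha, r, N, m):
    """Highest integer a0 with P{a <= a0} <= alpha under Y = 0 (-1 if none)."""
    a0, tail = -1, 0.0
    while a0 + 1 <= min(r, m):
        tail += hyper_pmf(a0 + 1, r, N, m)
        if tail > alpha:
            break
        a0 += 1
    return a0

def normal_cdf(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def approx_power(X, Y, Z, m, alpha):
    """Normal approximation to the power, array by array."""
    N = X + Y + Z
    total = 0.0
    for y1 in range(0, min(Y, m) + 1):
        p_y1 = hyper_pmf(y1, m, N, Y)           # exact weight of the array
        if p_y1 == 0.0:
            continue
        a0 = critical_limit(alpha, X + Y - y1, N, m)
        p = (m - y1) / (N - Y)                  # equation (8)
        mean = X * p                            # equation (5)
        var = X * p * (1 - p) * Z / (X + Z - 1) if X + Z > 1 else 0.0  # equation (6)
        if var > 0.0:
            area = normal_cdf((a0 + 0.5 - mean) / sqrt(var))
        else:
            area = 1.0 if mean <= a0 else 0.0   # degenerate array (assumed convention)
        total += p_y1 * area
    return total

print(round(approx_power(10, 10, 10, 15, 0.05), 3))
```

For X = Y = Z = 10 and m = 15 this sketch gives about 0.415, against the exact value 0.413, in line with the corresponding row of Table 4.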


Table 4. Approximate solution for power function. Case m = n = ½N = 15

                       α = 0.05                   α = 0.15
     X      Y     True power   Approx.       True power   Approx.

     5      2       0.047       0.053          0.121       0.126
            5        .135        .139           .321        .324
           10        .413        .414           .764        .758
           15        .852        .844           .984        .979
           20        .999        .997          1.000        .999

    10      2       0.039       0.043          0.166       0.170
            6        .155        .161           .420        .421
           10        .413        .415           .745        .740
           14        .773        .767           .962        .956
           18        .989        .989          1.000       1.000

    15      1       0.033       0.036          0.136       0.140
            3        .064        .067           .224        .227
            6        .152        .157           .419        .420
            9        .326        .329           .676        .673
           12        .655        .651           .913        .908
           14        .951        .948           .990        .989


(iii) N = 50 

Encouraged by the check for N = 30, the approximate method was employed throughout 
in this case, a few checks only being made by the full method, which now becomes rather 
laborious. The resulting power curves are shown in Figs. 6 and 7. 

As N increases above 50, it is probable that further simplifying approximations could be 
introduced, but these we have not explored. It would also be possible to form diagrams 
giving contours of constant power, similar to those provided by Patnaik (1948) in his 
Figs. 2 and 3, for the case where samples are drawn independently from two populations. 
The parameters corresponding to his $p_1$ and $p_2$ would now seem to be X/N and (X + Y)/N,
the proportion of individuals among the N who would respond to treatments A and B,
respectively. A preliminary examination on the basis of the results used to form Figs. 2-7,
suggests that the contours corresponding to various values of N, $\alpha$ and the power, may all
belong approximately to a single family of curves, as was the case in Patnaik’s problem,
for m = n.




4. Discussion 

In the first place some comment is necessary on the significance levels chosen. Owing to
the discontinuity which occurs because the cell contents can assume integer values only,
the standard levels of 10, 5, 1 %, etc., have not their ordinary meaning. As an example, take
the case where m = n = ½N = 25, r = 16. On the null hypothesis we have:

    Chance that a ≤ 6 is 0.182,
                  ≤ 5 is 0.064,
                  ≤ 4 is 0.016,
                  ≤ 3 is 0.003.

Thus if we wish to accept no more than a 1 in 20 risk of rejecting the hypothesis when it is
true, we should only do so when a ≤ 4. But in practice, if prepared to accept a risk of about
1 in 20, we should clearly take a = 5 as the limit. Similarly, we should take a = 4 for a risk
of about 1 in 100. Thus we should tend to base our conclusions on the actual tail sum found
after the data are collected. But if we use the power function concept to inquire in advance
how large N must be to make an experiment worth while, then we must think in terms of
a specific upper limit or nominal level $\alpha$. Hence it is important to know how much on the
average the true level falls short of the nominal. Table 5 has been prepared to indicate this
difference in the cases with N = 20, 30 and 50 and for nominal levels of $\alpha$ = 0.05 and 0.15.
Finney’s (1948) table also brings this out.*

These upper limit values are well on the safe side, and this may be what is wanted if we
attach prime importance to Neyman & Pearson’s first kind of error; i.e. to the risk of assuming
a difference when none exists. But where it is of first importance to find a new solution,
e.g. an improvement in treatment, higher risks in this direction will be accepted in order to
avoid the chance of falling into the second kind of error, i.e. of overlooking a real difference
when it exists. For this reason our calculations have been made for $\alpha$ = 0.05 and also for
the rather high value $\alpha$ = 0.15, which, as can be found from Table 5, means on the average†
a true level of $\alpha$ = 0.057 for N = 20, $\alpha$ = 0.082 for N = 30, $\alpha$ = 0.089 for N = 50.

Turning to the interpretation of the diagrams, it must be emphasized again that the 
frequencies X , Y and Z will of course not be known ; the purpose of the charts is to show for 
certain sample sizes, the combinations of these three frequencies needed to give a reasonable 
chance of establishing a significant treatment difference. If the investigation has the positive 
objective of establishing the value of new methods, it will naturally be hoped that the com- 
parative experiment will establish statistical significance. If, with the knowledge available, 
it can be shown that an experiment of given magnitude is very unlikely to lead to a signi- 
ficant result, then it may be a waste of time to proceed on this scale. We think that the results 
presented in the diagrams will be helpful in this type of review.

If treatment A has already been in use, some information will be available as to the likely 
value of X/N, the proportion of individuals in a group of N who would react to A. We shall 
at least know whether it is more likely to be 0 or […]. The ratio Y/(Y + Z) used as the abscissa
for the power curves measures the headway which the new treatment B could make in 
causing a satisfactory response among individuals with whom treatment A fails. Again, 
the experimenter will generally have some idea of what he hopes the treatment will achieve. 

* See Appendix, p. 345 below.
† Giving equal weight to all values of r.





E. S. Pearson and Maxine Merrington

He may know, for example, that Y/(Y + Z) is unlikely to exceed 0.25, and yet be clear that
even if it were no larger than this the introduction of B would be amply justified. Here
Figs. 4 and 5 show that, for m = n = 15, were X = 0 (treatment A ineffective*) and
Y/(Y + Z) = 8/30 = 0.27, the chance of establishing significance is about 0.66 at the nominal
5 % level and 0.89 at the nominal 15 % level. But if X were 10, or one-third of the group
of N = 30 were responsive to the old treatment then, though 5 of the remaining 20 would


Table 5. Showing for the case m = n = ½N: (i) the critical limits a₀(α, r, N, m) for nominal
significance levels α = 0.05 and 0.15; (ii) the true chance that a ≤ a₀(α, r, N, m)

              N = 20                    N = 30                    N = 50
       α = 0.05    α = 0.15      α = 0.05    α = 0.15      α = 0.05    α = 0.15
  r    a₀  True    a₀  True      a₀  True    a₀  True      a₀  True    a₀  True
           chance      chance        chance      chance        chance      chance

  3    -   -       0   0.105     -   -       0   0.112     -   -       0   0.117
  4    0   0.043   0   0.043     0   0.050   0   0.050     -   -       0   0.055
  5    0   0.016   0   0.016     0   0.021   0   0.021     0   0.025   0   0.025
  6    0   0.005   1   0.070     0   0.008   1   0.084     0   0.011   1   0.096
  7    1   0.029   1   0.029     1   0.040   1   0.040     1   0.049   1   0.049
  8    1   0.010   2   0.085     1   0.018   2   0.107     1   0.024   2   0.123
  9    2   0.035   2   0.035     1   0.007   2   0.054     1   0.012   2   0.069
 10    2   0.012   3   0.089     2   0.025   3   0.123     2   0.037   3   0.144
 11    -   -       -   -         2   0.010   3   0.064     2   0.019   3   0.085
 12    -   -       -   -         3   0.030   4   0.132     3   0.048   3   0.048
 13    -   -       -   -         3   0.013   4   0.070     3   0.025   4   0.098
 14    -   -       -   -         4   0.033   5   0.136     3   0.013   4   0.057
 15    -   -       -   -         4   0.013   5   0.071     4   0.031   5   0.108
 16    -   -       -   -         -   -       -   -         4   0.016   5   0.064
 17    -   -       -   -         -   -       -   -         5   0.038   6   0.119
 18    -   -       -   -         -   -       -   -         5   0.021   6   0.072
 19    -   -       -   -         -   -       -   -         6   0.042   7   0.124
 20    -   -       -   -         -   -       -   -         6   0.023   7   0.077
 21    -   -       -   -         -   -       -   -         7   0.044   8   0.128
 22    -   -       -   -         -   -       -   -         7   0.024   8   0.079
 23    -   -       -   -         -   -       -   -         8   0.046   9   0.131
 24    -   -       -   -         -   -       -   -         8   0.025   9   0.081
 25    -   -       -   -         -   -       -   -         9   0.046  10   0.131

For r > ½N the true chance for r is the same as that for N − r, while a₀(α, r, N, m) = r − m + a₀(α, N − r, N, m).


react if given treatment B (Y/(Y + Z) = 0.25), the experiment will most probably be
inconclusive. This is because the power of the test is now only 0.12 and 0.34, respectively, at
the 5 and 15 % levels.

If we regard Y/(Y + Z) as the measure of effectiveness, the existence of X reduces the
chance of establishing significance. We need not express effectiveness in this way, but the
numerical data given in the charts associating power with the partition of N in X, Y and Z
make it possible to express the position in any terms considered more appropriate.

As a further example, we may take the case where out of N = 30 individuals, X = 10
would react to treatment A and X + Y = 20 to treatment B. This might well be regarded 


* At least, ineffective on the 30 individuals selected for the experiment. 







as an eminently satisfactory result. But Figs. 4 and 5 show that the chance of distinguishing
this case from one in which Y = 0 is only 0.41 at the 5 % level and 0.74 at the 15 % level.*
If now we make a rough interpolation in Figs. 6 and 7 between the power curves for X = 10
and X = 25, it is seen that using 50 individuals for the experiment, then were X = Y = 17
(keeping the same proportions as before) there would be a chance of about 0.76 of establishing
significance at the 5 % level and of about 0.92 at the 15 % level. There are likely to be
circumstances where, with this knowledge, it would rightly be concluded that while a
comparative experiment on two sets of 15 individuals would not be worth undertaking, an
experiment using two sets of 25 would be.

Another point which the diagrams bring out is the dilemma which faces the medical
research worker who wishes to establish his conclusions on a scientific basis, but has a strong
conviction that a new treatment will be effective in reducing pain, hastening recovery or
even saving life. Dealing with a hospital population of varying susceptibilities and with
other changes in external conditions from time to time, it may be impossible to make
comparisons which could be accepted indisputably between (a) successes in the past, using
treatment A, and (b) successes at the moment using treatment B. Yet if a controlled,
comparative test is carried out, it is seen that to provide a conclusive result the number of
patients in the group Y must be considerable, perhaps 10, 16 or even 20. And yet, on the
average, ½Y of these patients will be assigned the old treatment. Even if the belief in
treatment B is based on intuition rather than evidence, considerations of ethics are likely to
outweigh the urge to plan an experiment which makes a valid scientific comparison possible.

In conclusion, we freely admit that the presentation given in Figs. 2-7 cannot be regarded 
as a final one. It does, however, provide enough material for the statistician to consider 
whether the approach is practically useful and, hence, whether its extension and possible 
simplification are desirable. 


REFERENCES 

Barnard, G. A. (1947). Biometrika, 34, 123.
Finney, D. J. (1948). Biometrika, 35, 145.
Patnaik, P. B. (1948). Biometrika, 35, 157.
Pearson, E. S. (1947). Biometrika, 34, 139.
Pearson, K. (1895). Philos. Trans. A, 186, 411.
Pearson, K. (1924). Biometrika, 16, 172.


* In the experiment with disks referred to on p. 335 above, in 20 partitions out of 50 significance was 
established at the 5% level and in 40 out of 50 at the 15% level, results clearly consistent with these 
theoretical chances. 





APPENDIX 

Note on the arrangement of D. J. Finney’s table (Biometrika, 35, 145-56)

In the present paper we have been mainly concerned with a lower significance level for a,
for a given value of r = a + b. This limiting value of a is determined by the sum of the tail
terms in a hypergeometric series. Thus in Fig. 8 below, for m = n = ½N = 15, when r = 13,
what we have termed the nominal lower 5 % significance level for a is 3 and when r = 12
it is also 3. This is because the sum of the chances of a assuming values of 0, 1, 2 and 3 on the
null hypothesis is 0.0127 in the former case and 0.0301 in the latter, while it would be over
0.06 if we included the chance that a = 4. On this basis, the stepped line indicates the position
of the significance level for different values of r.
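
The rule just described, accumulating tail terms of the hypergeometric series until the nominal level would be exceeded, can be sketched as follows. This is an illustrative Python sketch rather than the authors' own computing scheme; the name `critical_limit` and its return convention are assumptions made here.

```python
from math import comb

def critical_limit(alpha, r, m, n):
    """Return (a0, true_chance): the largest a0 for which the lower tail sum
    P(a <= a0) does not exceed the nominal level alpha, and that tail sum.
    Returns (None, 0.0) when even the first term exceeds alpha."""
    total = comb(m + n, r)
    lowest, highest = max(0, r - n), min(r, m)
    a0, true_chance, running = None, 0.0, 0.0
    for a in range(lowest, highest + 1):
        running += comb(m, a) * comb(n, r - a) / total
        if running > alpha:
            break
        a0, true_chance = a, running
    return a0, true_chance

if __name__ == "__main__":
    for r in (13, 12):
        a0, chance = critical_limit(0.05, r, 15, 15)
        print(f"r = {r}: limit a0 = {a0}, true chance = {chance:.4f}")
```

Run for m = n = 15 this gives the limit 3 for both r = 13 and r = 12, the two cases cited in the text; the same function applied with α = 0.15 gives the second set of limits tabulated in Table 5.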



Fig. 8 

In Finney’s arrangement, when dealing with the single-tail test, a/A > b/B, where A = m
and B = n. Thus in Fig. 8 the scales of a and b must be reversed. His table is not entered
with r, but shows for a given a the highest value of b which is just significant at the 5 % level.
Thus when A = B = 15, for a = 11 he gives b = 5, and for a = 12 he gives b = 6. These
points correspond to r = 16, a = 11 and r = 18, a = 12 in the diagram. It will be seen that on
this basis an entry for the point a = 12, b = 5 on the intermediate diagonal with r = 17 is
unnecessary and space is saved in tabulation. But there is no difference whatsoever in the
basic calculations leading to Finney’s table and to the special results we have given in
Table 5 above.






STATISTICAL ANALYSIS OF A NON-ORTHOGONAL 
TRI-FACTORIAL EXPERIMENT 

By W. L. STEVENS, Universidade de Sao Paulo, Brazil

1. Introduction

The formal mathematical solution to the problem of ‘fitting constants’ to a set of non- 
orthogonal data presents no great difficulty, but at first sight the arithmetical labour of 
solving the normal equations may appear so great as to discourage the experimenter. It is 
therefore important to show, by suitable worked examples, that the arithmetical procedure 
can be reduced to a routine so that, although the computations will always be lengthy, they 
can at least be seen to follow a simple and recognizable pattern. 

The classical treatment of the problem by the ‘method of least squares’ pays scant 
attention to the formulation of valid tests of significance. When interactions of various 
factors have to be tested for significance, the mathematical solution is likely to appear very 
complex and much less easy to understand than the arithmetic of a worked example. 

Special cases of non-orthogonality, such as those arising from a ‘missing plot’ or a missing
row in a Latin square, have been treated by Fisher, Yates and other writers who have 
produced elegant solutions, but we are concerned here with a breakdown of the ortho- 
gonality restrictions so complete that no special devices are of service. The example to be 
discussed arose from an experiment made by Dr Rocha Faria of Lisbon, and the data were 
kindly placed at the disposal of the writer by the Portuguese statistician, Sr Augusto J. de 
Oliveira. Since it would be difficult even to invent an example more suitable for an 
illustration of the methods, opportunity has been taken to demonstrate the full arithmetical 
procedure necessary to effect an analysis of these data. 

We may begin by reminding the reader of what is meant by ‘fitting constants’. Suppose,
for example, that only two factors are being studied — sex and diet, i.e. we have a number of 
animals of the two sexes and we divide them into three groups to be fed on three diets, 
A, B and C respectively, and record the gains in weight during a specified period. If the 
two sexes, even on the same diet, grow at different rates, and if in addition the diets produce 
different effects, it is possible that the expected gain in weight will be given by the formula 

$$E + E_S + E_D,$$

where E is a constant, $E_S$ is one of two constants, according to whether the animal is male
or female, and $E_D$ is one of three constants according to which of the three diets is given.
The observed gain in weight will, of course, differ from the expected gain, owing to un- 
assignable biological variation, errors of weighing, etc., but we are supposing that this 
‘residual’ variation is random and would average zero over a sufficiently large number of 
animals. 

The constants appearing in the above formula may be termed ‘effects’, since the $E_S$ measure
the effect of sex and the $E_D$ measure the effects of diet. The first constant E may be called, for
want of a better name, the ‘general effect’. It will be noticed that they are to some extent
arbitrary, for if, say, we increase every $E_D$ by 5 units and diminish E by 5 units, the answers
given by the formula will be unaltered. In fact, it is only the differences between the two
values of $E_S$ and between the three values of $E_D$ which will be measured by an experiment.





The ambiguity could, of course, be removed by saying that the sum of the two values of $E_S$
or of the three of $E_D$ shall be zero, but generally we see no advantage in adopting such a
convention.
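
The arbitrariness of the constants is easily demonstrated numerically. In the sketch below (Python) the numerical values of the constants are purely illustrative and are not taken from any experiment; the function name `expected_gain` is likewise ours.

```python
def expected_gain(E, E_S, E_D, sex, diet):
    """Expected gain in weight under the additive formula E + E_S + E_D."""
    return E + E_S[sex] + E_D[diet]

# Illustrative constants only (hypothetical values).
E = 70.0
E_S = {"male": 18.0, "female": 0.0}
E_D = {"A": -20.0, "B": 0.0, "C": -1.0}

# Increase every diet constant by 5 units and diminish E by 5 units.
E_shifted = E - 5.0
E_D_shifted = {diet: value + 5.0 for diet, value in E_D.items()}

if __name__ == "__main__":
    for sex in E_S:
        for diet in E_D:
            before = expected_gain(E, E_S, E_D, sex, diet)
            after = expected_gain(E_shifted, E_S, E_D_shifted, sex, diet)
            print(sex, diet, before, after)  # the two values agree in every class
```

Only differences such as E_D["A"] − E_D["B"] are the same in both parametrizations, which is why no sum-to-zero convention is needed in what follows.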


It will also be noticed that the adoption of such a formula implies that the effects of sex
and diet are additive, i.e. that the sex difference is the same on all diets or (equivalently) the
dietary differences are the same in both sexes. Such an assumption may or may not be
justified; it is possible, for example, that males respond best to one diet while females
respond best to another. In general, if the effects are not additive, the formula required
would be

$$E + E_{SD},$$

where $E_{SD}$ is one of six constants corresponding to the six combinations of two sexes and
three diets.


If the first formula is adequate, the two factors, sex and diet, are said to be independent;
if the second formula is needed, there is said to be an interaction between sex and diet. It is
seen that the analysis of the data poses two problems:

(a) What type of formula is needed?

(b) What are the best estimates of the numerical values of the constants in this formula?

The solution of both problems is very easy when the design of the experiment ensures
orthogonality between sex and diet. For orthogonality, it is not necessary that the
numbers in all six classes (generated by two sexes x three diets) should be equal; it is
sufficient if they obey a simple rule of proportionality illustrated in the following example:


Numbers of test animals

                              Diet
             A              B              C              Total

Male          8 (= 2 x 4)   10 (= 2 x 5)    6 (= 2 x 3)   24 (= 2 x 12)
Female       12 (= 3 x 4)   15 (= 3 x 5)    9 (= 3 x 3)   36 (= 3 x 12)

Total        20 (= 5 x 4)   25 (= 5 x 5)   15 (= 5 x 3)   60 (= 5 x 12)
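
The rule of proportionality amounts to requiring that every class number equals (row total x column total) / grand total. A minimal Python check is sketched below; the function name is ours, and the second set of counts is the sex x diet classification of the guinea-pig data analysed later in the paper (Table 2), included here only for contrast.

```python
def is_proportional(counts):
    """True when every cell count equals (row total x column total) / grand total."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand = sum(row_totals)
    # Compare cross-products in integers to avoid any rounding.
    return all(
        counts[i][j] * grand == row_totals[i] * col_totals[j]
        for i in range(len(counts))
        for j in range(len(counts[0]))
    )

proportional_example = [[8, 10, 6], [12, 15, 9]]   # males, females on diets A, B, C
guinea_pig_counts = [[6, 6, 5, 6], [3, 3, 4, 3]]   # males, females on diets A, B, C, D

if __name__ == "__main__":
    print(is_proportional(proportional_example))
    print(is_proportional(guinea_pig_counts))
```

The first classification satisfies the rule although its class numbers are unequal; the second does not.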


When, however, the orthogonality relation no longer holds, there is no simple method of 
testing the significance of interaction and of estimating the required constants. It becomes 
necessary to solve large sets of simultaneous linear equations which is only practicable if 
a routine can be developed which will converge on the solutions through a series of approxi- 
mations. Furthermore, a valid analysis of variance can no longer be constructed by the 
standard methods with which the reader is probably familiar. When, as in the example to 
be discussed, there are more than two factors in the experiment, these difficulties are 
aggravated. 

2. The experiment and data 

The experiment was designed to observe the effect on the growth of guinea-pigs of four 
diets distinguished by the four types of wheat which they included : 

A = soft (70%), B = soft (100%), C = hard (70%), D = hard (100%).

The guinea-pigs were of two sexes and drawn from four litters, so there are three factors:
sex, diet and litter.





The data are shown in Table 1, where the complete lack of orthogonality will be apparent. 
Although no particular interest attaches to the difference between litters, yet the effect of 
litter will have to be measured in order that it may be eliminated from comparisons between 
sexes and comparisons between diets. Thus for the purpose of analysis the experiment is 
tri-factorial. 

It is pertinent to describe how, with the material at his disposal, the experimenter might
have designed an experiment free from the disadvantages which usually attach to
non-orthogonality. From the guinea-pigs used in the experiment, it is possible to make up three
female and seven male triads, i.e. sets of three of the same sex and from the same litter.
A further litter would probably have provided a fourth female and an eighth male triad.
From the four diets one can construct four sets of three by dropping one at a time. The sets
of three diets are then allotted to the triads of animals according to the following scheme:


Litter         Males          Females

I              ABC  ABC       ABC
II             BCD  BCD       BCD
III and V      CDA  CDA       CDA
IV             DAB  DAB       DAB


Within each triad the designated three diets are allocated at random.
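
The construction of the four sets of three diets, and the random allocation within a triad, may be sketched as follows. This is an illustrative Python sketch only; the function names and the animal labels are hypothetical.

```python
from itertools import combinations
import random

DIETS = "ABCD"

def diet_sets():
    """The four sets of three diets obtained by dropping one diet at a time."""
    return [tuple(d for d in DIETS if d != dropped) for dropped in DIETS]

def allocate(triad, diets, rng):
    """Assign the three designated diets at random to the three animals of a triad."""
    shuffled = list(diets)
    rng.shuffle(shuffled)
    return dict(zip(triad, shuffled))

def pair_counts(sets):
    """Number of sets in which each pair of diets occurs together."""
    return {pair: sum(set(pair) <= set(s) for s in sets)
            for pair in combinations(DIETS, 2)}

if __name__ == "__main__":
    sets = diet_sets()
    print(sets)
    print(pair_counts(sets))
    rng = random.Random(1947)  # fixed seed so the illustration is repeatable
    print(allocate(("animal 1", "animal 2", "animal 3"), sets[0], rng))
```

Every diet occurs in three of the four sets and every pair of diets occurs together in two of them, which is the regularity that makes the design balanced.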

The experiment described (an example of balanced incomplete blocks) is not orthogonal
but the departures from orthogonality follow a certain regular pattern, as a consequence of
which the statistical analysis, while not so simple as for an orthogonal design, is nevertheless
relatively straightforward and follows a standard method.

Other ‘balanced’ designs are available; the one illustrated was chosen because it gives
information of high precision on the effects of diets, of sexes and of the interaction between
diet and sex, while yielding only relatively poor information on the differences between
litters.

Balanced designs have been described in order to emphasize that modern resources are
quite adequate for dealing with the kind of difficulty which confronted this experimenter,
and that unbalanced non-orthogonality with its attendant complications could easily have
been avoided if he had consulted a statistician before instead of after doing the experiment.
However, this is no answer to a research worker who has spent time and money on his
experiment and has discovered too late that better methods are available. The statistician
must provide a technique for analysing the data and hope that the arithmetical labours
required will be sufficient to discourage the experimenter from ever again disregarding the
principles of good experimental design.


3. Preliminary analyses of variance 

The first stage in the computation, set out in Table 1, is almost self-explanatory. There is
some replication present in the experiment, i.e. animals which belong to the same litter and
sex and receive the same diet, and as the replicates are all pairs, the ‘error sum of squares’
may be found from the sum of squares of the differences divided by two. The sum of squares






Table 1. Gains in weight (g.) of thirty-six guinea-pigs and first stage of the computation 






Table 2. Condensations of the data to two-factor tables, and computation
of sums of squares

                         Litter
Diet       I         II        III       IV        Total

A          159 (3)   132 (2)   143 (2)   138 (2)    572 (9)
B          236 (3)   235 (3)   177 (2)   100 (1)    748 (9)
C          161 (2)   213 (3)   180 (2)   179 (2)    733 (9)
D          178 (2)   239 (3)   213 (2)   185 (2)    815 (9)

Total      734 (10)  819 (11)  713 (8)   602 (7)   2868 (36)

                   Sex
Diet       Male        Female      Total

A           381 (6)    191 (3)      572 (9)
B           541 (6)    207 (3)      748 (9)
C           462 (5)    271 (4)      733 (9)
D           598 (6)    217 (3)      815 (9)

Total      1982 (23)   886 (13)    2868 (36)

                         Litter
Sex        I         II        III       IV        Total

Male       546 (7)   479 (6)   575 (6)   382 (4)   1982 (23)
Female     188 (3)   340 (5)   138 (2)   220 (3)    886 (13)

Total      734 (10)  819 (11)  713 (8)   602 (7)   2868 (36)

Note. Numbers in brackets are the numbers of observations contributing
to the respective subtotals.

Sums of squares:

              Single factors                    Pairs of factors
       Sexes     Diets     Litters      Diets,     Litters,   Sexes,
                                        litters    sexes      diets

       231,181   232,022   230,172      234,507    232,970    235,763
       228,484   228,484   228,484      228,484    228,484    228,484
       -------   -------   -------      -------    -------    -------
         2,697     3,538     1,688        6,023      4,486      7,279

Sums of squares for interactions:

Diets                         3538    Litters           1688    Sexes            2697
Litters                       1688    Sexes             2697    Diets            3538
Interaction (by difference)    797    Interaction        101    Interaction      1044

Diets, litters                6023    Litters, sexes    4486    Sexes, diets     7279





Table 3. Second analysis of variance (approximate)

Source of variation                      Degrees of   Sum of    Mean     Variance   Significance
                                         freedom      squares   square   ratio

Sex                                           1         2697     2697     54.2      ***
Diet                                          3         3538     1179     23.7      ***
Litter                                        3         1688      563     11.3      **
                                              7         7923     1132

2-factor interactions:
  Diet/litter                                 9          797     88.6     1.78      n.s.
  Litter/sex                                  3          101     37.0     0.74      n.s.
  Sex/diet                                    3         1044      348     6.99      **
                                             15         1942

3-factor interaction: sex/diet/litter         4         -593 (?)

Between combinations                         26         9272
Error                                         9          448     49.8

Total                                        35         9720





between combinations of litter, sex and diet is then found by subtraction, thus reversing the 
usual procedure. Here and elsewhere the following code is used for denoting significance : 

***   virtual certainty    P < 0.1 %

**    highly significant   0.1 % < P < 1 %

*     significant          1 % < P < 5 %

[ 1 ] suggestive 5 % < P < 20 % 

n.s.  not significant      20 % < P


The short analysis of variance shows that some at least of litter, sex and diet must be 
exerting significant effects and we can proceed to examine these in more detail. 

The analysis of variance into the items corresponding to the three primary factors and 
their interactions is a simple extension of the method used for an ordinary orthogonal 
three-factor experiment. The data are condensed (Table 2) into three two-way tables showing
the subtotals for all combinations of diet x litter, of sex x diet and of litter x sex. Since, as 
a consequence of non-orthogonality, the number of observations contributing to these 
subtotals will vary, the actual numbers should be inserted in brackets below the corre- 
sponding subtotals (see Table 2). 

The sum of squares for any primary factor is found in the way usual when the numbers of 
observations are unequal, e.g. for ‘between litters’ we have 

$$\frac{734^2}{10} + \frac{819^2}{11} + \frac{713^2}{8} + \frac{602^2}{7} - \frac{2868^2}{36} = 1688.$$
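
The same arithmetic applies to any primary factor. As a check, a short Python sketch (the function name is ours) using the subtotals and numbers of observations of Table 2:

```python
def between_groups_ss(subtotals, counts):
    """Sum of squares between groups of unequal size:
    sum of (subtotal^2 / number of observations) less (grand total)^2 / n."""
    grand_total = sum(subtotals)
    n = sum(counts)
    return sum(t * t / c for t, c in zip(subtotals, counts)) - grand_total ** 2 / n

if __name__ == "__main__":
    litters = between_groups_ss([734, 819, 713, 602], [10, 11, 8, 7])
    sexes = between_groups_ss([1982, 886], [23, 13])
    diets = between_groups_ss([572, 748, 733, 815], [9, 9, 9, 9])
    print(round(litters), round(sexes), round(diets))
```

The three results reproduce the single-factor sums of squares entered in Table 2.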


Corresponding to any pair of factors, we may find a sum of squares between all combinations
of the two factors. Thus between combinations of diet and litter the sum of squares is

$$\frac{159^2}{3} + \frac{132^2}{2} + \frac{143^2}{2} + \dots + \frac{185^2}{2} - \frac{2868^2}{36} = 6023.$$






From this sum we may remove the sums of squares corresponding respectively to the two 
single factors, diet and litter, and leave something which may be labelled ‘interaction 
between diet and litter’. 
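
The subtraction may be followed through numerically. The sketch below (Python; names are ours) takes the diet x litter subtotals and numbers of observations from Table 2 and forms the item labelled ‘interaction’ by difference.

```python
# Diet x litter subtotals and numbers of observations, from Table 2.
SUBTOTALS = [[159, 132, 143, 138],   # diet A, litters I-IV
             [236, 235, 177, 100],   # diet B
             [161, 213, 180, 179],   # diet C
             [178, 239, 213, 185]]   # diet D
COUNTS = [[3, 2, 2, 2],
          [3, 3, 2, 1],
          [2, 3, 2, 2],
          [2, 3, 2, 2]]

def ss(totals, counts):
    """Sum of (total^2 / count) less the correction term (grand total)^2 / n."""
    return sum(t * t / c for t, c in zip(totals, counts)) - sum(totals) ** 2 / sum(counts)

def flatten(table):
    return [x for row in table for x in row]

combinations_ss = ss(flatten(SUBTOTALS), flatten(COUNTS))
diets_ss = ss([sum(r) for r in SUBTOTALS], [sum(r) for r in COUNTS])
litters_ss = ss([sum(c) for c in zip(*SUBTOTALS)], [sum(c) for c in zip(*COUNTS)])
interaction_by_difference = combinations_ss - diets_ss - litters_ss

if __name__ == "__main__":
    print(round(combinations_ss), round(diets_ss), round(litters_ss),
          round(interaction_by_difference))
```

As the next paragraph of the text stresses, the last figure is only nominally an interaction sum of squares when the data are non-orthogonal.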

Notice however that, as a consequence of non-orthogonality, the primary factors are not
wholly independent, and the items labelled ‘interaction’ are therefore not true interactions
nor even true ‘sums of squares’. This becomes very evident when we calculate the three-factor
interaction by difference (see Table 3) and find that we have run into serious trouble
in respect of both degrees of freedom and ‘sum of squares’. Only 4 degrees of freedom
remain (although the three-factor interaction should have 3 x 3 x 1 = 9 degrees of freedom)
and the ‘sum of squares’ appears negative!
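
The breakdown is a matter of simple subtraction on the entries of Table 3, as the following Python lines show (variable names are ours):

```python
# Entries of Table 3: (degrees of freedom, 'sum of squares').
between_combinations = (26, 9272)
primary_factors = (7, 7923)          # sex, diet and litter together
two_factor_interactions = (15, 1942)

# Three-factor interaction obtained by difference.
df_by_difference = between_combinations[0] - primary_factors[0] - two_factor_interactions[0]
ss_by_difference = between_combinations[1] - primary_factors[1] - two_factor_interactions[1]

# Degrees of freedom a genuine three-factor interaction would carry:
# (sexes - 1) x (diets - 1) x (litters - 1).
df_expected = (2 - 1) * (4 - 1) * (4 - 1)

if __name__ == "__main__":
    print(df_by_difference, df_expected, ss_by_difference)
```

The deficiency of degrees of freedom reflects the fact that only 27 of the 32 possible combinations of sex, diet and litter are represented in the data.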

It is evident that the analysis of variance displayed in Table 3 is not genuine, but it will 
nevertheless prove a useful guide to decide what kind of formula will be needed to represent 
the effects of the various factors. If, therefore, we take the analysis of variance at its face
value, it would appear that all three primary factors have significant effects, and that it is
likely also that a significant interaction will be found between sex and diet. 

We shall therefore begin by finding sets of ‘ constants ’ appropriate to three independent 
primary factors and afterwards examine the interaction between sex and diet. 

4. Effects of three independent factors 
If the three factors are independent, the expected value of any observation is of the form

$$E + E_S + E_D + E_L,$$

where E is a constant, $E_S$ is one of two constants for the two sexes, $E_D$ is one of four
constants for the four diets, and $E_L$ is one of four constants for the four litters. We do not
assume that the constants within any set add to zero, and for practical purposes it is quite
immaterial if, for example, every $E_S$ is increased by 5 units and every $E_D$ diminished
correspondingly by 5 units.

The computation of the ‘effects’ (as we have termed these constants) is set out in Table 4.
The first step (i) is to note, as a guide in subsequent computations, the breakdown of the 
numbers of observations contributing to the various marginal subtotals. Thus under 
‘Diets’ and against A we find: 

9 = 3 + 2 + 2 + 2 = 6 + 3, 

meaning that there were 9 guinea-pigs on diet A and that these were drawn 3, 2, 2 and 2 
from litters I, II, III and IV respectively, and that they consist of 6 males and 3 females. 

The marginal subtotals are drawn from Table 2 and set out in (ii) of Table 4. The com- 
putational procedure may be regarded as answering the question: ‘What must I add to or 
deduct from each observation, in respect of sex (diet, litter) in order to ensure that the 
means of both sexes (all diets, all litters) shall be equal? ’ 

Thus the means for the two sexes are

Males       1982 ÷ 23 = 86.17
Females      886 ÷ 13 = 68.15
Difference              18.02

The average male exceeds the average female by 18.02 g. Two significant figures are
sufficient at this stage, so we shall accordingly deduct 18 from every male observation. The
individual adjusted observations are not required, and it is only necessary (and easier) to




Table 4. Calculation of adjustments for three independent factors (first cycle)

(i) Composition of the subtotals:

  Sexes            (diets)           (litters)
    M.    23 = 6 + 6 + 5 + 6 = 7 + 6 + 6 + 4
    F.    13 = 3 + 3 + 4 + 3 = 3 + 5 + 2 + 3

  Diets            (litters)         (sexes)
    A      9 = 3 + 2 + 2 + 2 = 6 + 3
    B      9 = 3 + 3 + 2 + 1 = 6 + 3
    C      9 = 2 + 3 + 2 + 2 = 5 + 4
    D      9 = 2 + 3 + 2 + 2 = 6 + 3

  Litters          (sexes)   (diets)
    I     10 = 7 + 3 = 3 + 3 + 2 + 2
    II    11 = 6 + 5 = 2 + 3 + 3 + 3
    III    8 = 6 + 2 = 2 + 2 + 2 + 2
    IV     7 = 4 + 3 = 2 + 1 + 2 + 2

(ii) Original marginal subtotals. Calculate adjustment for sex:

  Sexes                                    Diets               Litters
        Subtotals  Means  Adjustments            Subtotals           Subtotals
  M.      1982     86.17     -18           A        572        I        734
  F.       886     68.15       0           B        748        II       819
                                           C        733        III      713
                                           D        815        IV       602
          2868                                     2868                2868

(iii) Adjust all subtotals and calculate adjustments for diet:

  Sexes               Diets                                    Litters
        Subtotals           Subtotals  Means  Adjustments            Subtotals
  M.      1568        A        464     51.56     +20           I        608
  F.       886        B        640     71.11       0           II       711
                      C        643     71.44       0           III      605
                      D        707     78.56      -7           IV       530
          2454                2454                                     2454

(iv) Adjust all subtotals and calculate adjustments for litter:

  Sexes               Diets               Litters
        Subtotals           Subtotals           Subtotals  Means  Adjustments
  M.      1646        A        644        I        654     65.40     +14
  F.       925        B        640        II       730     66.36     +13
                      C        643        III      631     78.88       0
                      D        644        IV       556     79.43       0
          2571                2571                2571




Table 4 (continued): Second cycle and final adjustments

(v) Adjust subtotals for litter and calculate adjustments for sex:

  Sexes                                    Diets               Litters
        Subtotals  Means  Adjustments            Subtotals           Subtotals
  M.      1822     79.22    +0.16          A        712        I        794
  F.      1032     79.38      0            B        721        II       873
                                           C        710        III      631
                                           D        711        IV       556
          2854                                     2854                2854

(vi) Adjust subtotals for sex and calculate adjustments for diet:

  Sexes               Diets                                    Litters
        Subtotals           Subtotals  Means  Adjustments            Subtotals
  M.    1825.68       A      712.96    79.22    +1.00          I      795.12
  F.    1032          B      721.96    80.22      0            II     873.96
                      C      710.80    78.98    +1.24          III    631.96
                      D      711.96    79.11    +1.11          IV     556.64
        2857.68             2857.68                                  2857.68

(vii) Adjust subtotals for diet and calculate adjustments for litter:

  Sexes               Diets               Litters
        Subtotals           Subtotals           Subtotals  Means  Adjustments
  M.    1844.54       A      721.96       I      802.82    80.28    -0.28
  F.    1043.29       B      721.96       II     883.01    80.27    -0.27
                      C      721.96       III    638.66    79.83    +0.17
                      D      721.95       IV     563.34    80.48    -0.48
        2887.83             2887.83             2887.83

(viii) Adjust subtotals for litter and calculate final adjustments for all factors:

  Sexes                                  Diets                                  Litters
        Subtotals  Means   Adjustments         Subtotals  Means   Adjustments         Subtotals  Means   Adjustments
  M.    1840.06    80.003    -0.003      A      719.96    79.996      0         I      800.02    80.002      0
  F.    1040.00    80.000      0         B      720.17    80.019    -0.023      II     880.04    80.004    -0.002
                                         C      719.97    79.997    -0.001      III    640.02    80.002      0
                                         D      719.96    79.996      0         IV     559.98    79.997    +0.005
        2880.06                                2880.06                                2880.06

(ix) Adjust grand total and calculate general effect:

  Grand total (from viii)        2880.06
  Adjustment for sex               -0.069
  Adjustment for diet              -0.216
  Adjustment for litter            +0.013
  Final grand total              2879.788
  General effect (i.e. mean)       79.99411









Table 5. Collection of adjustments and computation of sum of squares

Factor          First        Second        Final      Total         Effect       Subtotals
                cycle        cycle         (viii)     adjustment                 (ii)

Sex     M.    (ii)  -18    (v)   +0.16    -0.003     -17.843       +17.843        1982
        F.            0            0        0           0             0            886

Diet    A     (iii) +20    (vi)  +1.00      0        +21.000       -21.000         572
        B             0            0      -0.023      -0.023        +0.023         748
        C             0          +1.24    -0.001      +1.239        -1.239         733
        D            -7          +1.11      0         -5.890        +5.890         815

Litter  I     (iv)  +14    (vii) -0.28      0        +13.720       -13.720         734
        II          +13          -0.27    -0.002     +12.728       -12.728         819
        III           0          +0.17      0         +0.170        -0.170         713
        IV            0          -0.48    +0.005      -0.475        +0.475         602

General effect (ix)                                                +79.99411      2868 (grand total)

Note. The small roman numerals indicate from where the figures have been drawn.

Sum of products of effects and corresponding subtotals = 236,355
X²/n                                                    = 228,484
Sum of squares for effects of factors                   =   7,871


Table 6. Third analysis of variance

Source of variation          Degrees of   Sum of    Mean     Variance   Significance
                             freedom      squares   square   ratio

Effects of the 3 factors          7         7871     1124     22.5      ***
Interactions                     19         1401     73.7     1.48      n.s.

Between ‘combinations’           26         9272
Residual                          9          448     49.8

Total                            35         9720


record the effects of these adjustments on all the subtotals. Thus the new male subtotal
will be 1982 − 23(18) = 1568.

The female subtotal will remain unaltered. The subtotal for diet A is based on 6 males
(and 3 females) as is shown in (i) of Table 4, and will therefore become

572 − 6(18) = 464.

Similarly, the subtotal for litter I is based on 7 males (and 3 females) and will therefore
become 734 − 7(18) = 608.

The adjusted subtotals are set out in (iii) of Table 4. It is noted that agreement between the
totals of the three sets provides a complete check on the adjustment of the subtotals.
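
The first adjustment and its check may be reproduced in a few lines. The Python sketch below uses the numbers of males in each diet and litter from (i) of Table 4 and the original subtotals from (ii); the variable names are ours.

```python
ADJUSTMENT_PER_MALE = -18  # deduct 18 from every male observation

sex_subtotals = {"M": 1982, "F": 886}
males = {"M": 23, "F": 0}

diet_subtotals = {"A": 572, "B": 748, "C": 733, "D": 815}
males_in_diet = {"A": 6, "B": 6, "C": 5, "D": 6}

litter_subtotals = {"I": 734, "II": 819, "III": 713, "IV": 602}
males_in_litter = {"I": 7, "II": 6, "III": 6, "IV": 4}

def adjust(subtotals, n_males):
    """Apply the male adjustment to each subtotal in proportion to the
    number of male observations it contains."""
    return {key: total + n_males[key] * ADJUSTMENT_PER_MALE
            for key, total in subtotals.items()}

new_sex = adjust(sex_subtotals, males)
new_diet = adjust(diet_subtotals, males_in_diet)
new_litter = adjust(litter_subtotals, males_in_litter)

if __name__ == "__main__":
    print(new_sex, new_diet, new_litter)
    # The check: all three sets must add to the same adjusted grand total.
    print(sum(new_sex.values()), sum(new_diet.values()), sum(new_litter.values()))
```

The three sets of adjusted subtotals agree in total, as in (iii) of Table 4.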






Next the four diet means are computed:

51.56,  71.11,  71.44,  78.56.

Since two significant figures are sufficient it will be convenient to make no adjustment for
diets B and C, add 20 for diet A and subtract 7 for diet D:

51.56 + 20 = 71.56,  71.11,  71.44,  78.56 − 7 = 71.56.

These adjustments are carried through into all subtotals as shown in (iv) of Table 4. Thus
the new male subtotal will be

1568 + 6(20) + 6(0) + 5(0) + 6(−7) = 1646.

The means for the four litters and hence the adjustments in respect of litter are now
computed, as in (iv) of Table 4, and the adjustments made in (v). This completes the first
cycle, adjustments having been made in turn for

sex → diet → litter.

The cycle of adjustments (carried to two decimals) may now be run through a second time
as set out in (v), (vi) and (vii) of Table 4.

The process described will converge fairly rapidly to a solution, i.e. we shall find that 
after a few cycles the adjustments have become negligibly small. However, we can judge 
from the drop in the magnitude of the adjustments from the first cycle to the second that 
the third cycle adjustments will all be small. The third cycle can therefore be abridged, as 
in (viii) of Table 4, by calculating all means and all final adjustments, to three decimals, 
at once. 

The subtotals, modified by these final adjustments, are not required, but the grand total 
should be adjusted as in (ix) of Table 4. The mean, found on dividing this final grand total 
by n = 36, will be the estimate of E, the so-called ‘general effect’. It should be calculated 
to one or two decimals more than are provided in the final adjustments.* 

In carrying out the adjustments as just described, it is, of course, immaterial on which 
factor we start or in which order we pass round the cycle, but other things being equal, it 
will usually be best to start with the factor producing the biggest effects and work down to 
the one with least effect. It is also immaterial what mean we aim at when applying adjust- 
ments, but the arithmetic is generally simpler if we aim at either the largest or smallest of 
the set so that one of the adjustments (in each set) is zero while the others are all of the same 
sign. However, in the first adjustment for diet ((iii) of Table 4) advantage was taken of the
fact that the two intermediate values were by chance approximately the same (71.11 and
71.44), and two adjustments were thus made zero with a resultant slight simplification of
the arithmetic.

The adjustments should now be collected together. For clarity this process has been set
out in Table 5, but usually they could be picked up and added straight from Table 4. The
resultant total adjustments, with their signs changed, will then be estimates of the ‘effects’,
i.e. of the constants E_S, E_D and E_L. They may be reckoned to be correct or nearly correct to
three decimals.

* The reason is that the sum of squares to be subsequently computed is near its maximum value (the 
residual sum of squares being near a minimum). The sum of squares is therefore relatively insensitive to 
errors in the estimated constants. Hence even if the estimates are good, say, to only three decimals, they 
may be treated, for the purpose of calculating a sum of squares, as though they were good to four or five 
decimals. 





In the last column of Table 5 are written the original marginal subtotals against the 
corresponding effects and, at the bottom, the original grand total against the final adjusted 
mean (i.e. general effect). The sum of squares for the effects of the three factors is found by 
adding the products of the last two columns, not omitting the general effect x grand total, 
and subtracting the usual term, X²/n = (2868)²/36:

(17.843)(1982) + (0)(886) + ... + (79.99411)(2868) - 228,484 = 7871.

The number of degrees of freedom to be attributed to this sum of squares is 7, for, although 
10 effects have been estimated, any arbitrary constant could be added to the effects in any 
one of the 3 sets (sex, diet and litter) and taken off the general effect without affecting the 
result, whence the number of degrees of freedom is 

10 - 3 = 7.

Notice that sums of squares are not obtained individually for the three factors but only one 
sum of squares for the three acting together. The sum of squares, 7871, is not greatly 
different from that obtained in the approximate analysis of variance, 7923 (see Table 3), 
which suggests that the first four lines of the approximate analysis are all fairly reliable. 

Subtracting the sum of squares for the effects (with 7 degrees of freedom) from the sum of 
squares between all combinations of the 3 factors (26 degrees of freedom) we are left with 
a sum of squares with 19 degrees of freedom representing various undifferentiated inter- 
actions between the factors (see Table 6). The corresponding mean square is larger, but not 
significantly larger, than the residual mean square, and hence there is at first sight no good 
evidence for interaction. However, the approximate analysis of variance (Table 3) did 
suggest that there is a significant interaction between sex and diet, and it is conceivable 
that this interaction, with only 3 degrees of freedom, has been hidden by being diluted with 
other interactions. It is therefore proper to make an exact analysis to test the significance 
of the sex x diet interaction. 


If no restrictions are placed on the form of the interaction between sex and diet, then there 
will be an 'effect’ appropriate to every combination of these two factors. This means that 
together they constitute virtually a single factor whose ‘levels’ are specified by the combi- 
nation, e.g. male on diet A, female on diet C, etc. The data can therefore be treated as 
though they had arisen from a bi-factorial experiment. 

The procedure for calculating the adjustments, as set out in Table 7, does not differ 
greatly from that employed with three factors. Sex x diet is now a single factor, and it is 
only to save paper that the 8 combinations have been laid out in a 4 x 2 table. It is found 
convenient to display the 4 litters horizontally instead of vertically, so that the block of 
numbers under litter I, etc., in (i) of Table 7 have the right pattern and orientation for
calculating the effect of the sex x diet adjustments on the litter subtotal. Thus having
found the adjustments for the eight sex x diet combinations in (ii) of Table 7, we calculate
the adjusted subtotal for litter I as follows:

          2(0)     + 1(0)
734     - 2(27)    - 1(5)      = 570.
        - 1(29)    - 1(4)
        - 2(36)    - 0(9)
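
The same calculation can be repeated for all four litters at once. The following Python sketch is offered only as a check on the arithmetic; it takes the cell counts from (i) of Table 7 and the first-cycle adjustments from (ii), and the dictionary layout is our own choice.

```python
# Animals in litters I, II, III, IV for each sex x diet combination (Table 7 (i)).
counts = {
    ("M", "A"): [2, 2, 1, 1], ("F", "A"): [1, 0, 1, 1],
    ("M", "B"): [2, 2, 1, 1], ("F", "B"): [1, 1, 1, 0],
    ("M", "C"): [1, 1, 2, 1], ("F", "C"): [1, 2, 0, 1],
    ("M", "D"): [2, 1, 2, 1], ("F", "D"): [0, 2, 0, 1],
}
# First-cycle adjustments for the eight combinations (Table 7 (ii)).
adjustment = {
    ("M", "A"): 0, ("F", "A"): 0,
    ("M", "B"): -27, ("F", "B"): -5,
    ("M", "C"): -29, ("F", "C"): -4,
    ("M", "D"): -36, ("F", "D"): -9,
}
litter_subtotals = [734, 819, 713, 602]

# Each animal carries the adjustment of its own sex x diet combination.
adjusted_litters = [
    subtotal + sum(counts[c][j] * adjustment[c] for c in counts)
    for j, subtotal in enumerate(litter_subtotals)
]
print(adjusted_litters)
```

The four results should agree with the litter subtotals entered in (iii) of Table 7.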



The convergence is not so rapid as in Table 4, and the third decimal is therefore of doubt- 
ful value. 
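
Since the procedure uses only the marginal subtotals and the cell counts, it can be run to convergence by machine. The Python sketch below is not the hand routine of Table 7: it applies unrounded adjustments alternately to the two factors for a fixed number of cycles (200 is an arbitrary choice of ours), and then refers the effects to female on diet A and to litter IV, as in the text. All names are hypothetical. The effects it yields should lie close to those collected in Table 8, though not necessarily agreeing in the last decimals.

```python
# Animals in litters I, II, III, IV for each sex x diet combination (Table 7 (i)).
counts = {
    ("M", "A"): [2, 2, 1, 1], ("F", "A"): [1, 0, 1, 1],
    ("M", "B"): [2, 2, 1, 1], ("F", "B"): [1, 1, 1, 0],
    ("M", "C"): [1, 1, 2, 1], ("F", "C"): [1, 2, 0, 1],
    ("M", "D"): [2, 1, 2, 1], ("F", "D"): [0, 2, 0, 1],
}
combo_subtotal = {
    ("M", "A"): 381, ("F", "A"): 191, ("M", "B"): 541, ("F", "B"): 207,
    ("M", "C"): 462, ("F", "C"): 271, ("M", "D"): 598, ("F", "D"): 217,
}
litter_subtotal = [734, 819, 713, 602]
litter_size = [sum(n[j] for n in counts.values()) for j in range(4)]

level = {c: 0.0 for c in counts}   # fitted level of each sex x diet combination
shift = [0.0] * 4                  # fitted shift of each litter

for _ in range(200):
    # Adjust for sex x diet, allowing for the current litter shifts.
    for c, n in counts.items():
        level[c] = (combo_subtotal[c] - sum(n[j] * shift[j] for j in range(4))) / sum(n)
    # Adjust for litter, allowing for the current combination levels.
    for j in range(4):
        shift[j] = (litter_subtotal[j]
                    - sum(counts[c][j] * level[c] for c in counts)) / litter_size[j]

# Express the effects relative to female on diet A and to litter IV.
combo_effect = {c: level[c] - level[("F", "A")] for c in counts}
litter_effect = [s - shift[3] for s in shift]
general_effect = level[("F", "A")] + shift[3]

grand_total = sum(litter_subtotal)
n_total = sum(litter_size)
sum_of_squares = (sum(combo_effect[c] * combo_subtotal[c] for c in counts)
                  + sum(e * t for e, t in zip(litter_effect, litter_subtotal))
                  + general_effect * grand_total
                  - grand_total ** 2 / n_total)

print(round(general_effect, 3), round(sum_of_squares, 1))
```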

The sum of squares for effects (calculated in Table 8) has 10 degrees of freedom since 
there are now two sets of effects, one with 8 and one with 4, and 

8 + 4-2 = 10. 

Alternatively we may argue that there are 7 independent comparisons between the 8 sex x 
diet combinations and 3 independent comparisons between the 4 litters. Hence number of 
degrees of freedom is 

7 + 3 = 10 as before. 

The sum of squares (7 degrees of freedom) for the three independent factors previously
obtained may now be subtracted from this sum of squares with 10 degrees of freedom to
leave a sum of squares with 3 degrees of freedom representing the interaction between sex
and diet, as set out in Table 9. The balance of 16 degrees of freedom to make up the 26
degrees of freedom between all combinations represents undifferentiated interactions
between sex and litter, diet and litter and between all three factors.

From the analysis of variance (Table 9) it appears that not only are the other interactions 
non-significant, but their mean square is scarcely greater than the residual mean square. 
The variance ratio for the sex x diet interaction is 

F = 196.0/49.8 = 3.936,

and is therefore just significant at the 5% level. There is accordingly some moderately good
evidence of an interaction between sex and diet. 
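
The subtraction and the variance ratio can be verified in a few lines. The sketch below is illustrative only; the 5% point of F for 3 and 9 degrees of freedom (3.86) is taken from standard tables and is not quoted in the text, and the variable names are our own.

```python
# Sums of squares and degrees of freedom quoted in the text and in Table 9.
ss_three_factors, df_three_factors = 7871, 7
ss_with_sex_diet, df_with_sex_diet = 8459, 10
ss_residual, df_residual = 448, 9

interaction_ss = ss_with_sex_diet - ss_three_factors
interaction_df = df_with_sex_diet - df_three_factors
interaction_ms = interaction_ss / interaction_df
residual_ms = ss_residual / df_residual

variance_ratio = interaction_ms / residual_ms

# Assumption: tabulated 5% point of F with (3, 9) degrees of freedom.
F_5_PERCENT_3_9 = 3.86
print(interaction_ss, interaction_df, round(variance_ratio, 2),
      variance_ratio > F_5_PERCENT_3_9)
```

Using the unrounded residual mean square, 448/9, gives a ratio differing from 3.936 only in the third decimal.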

We may now examine the nature of this interaction by listing the differences, male 
minus female, on the four diets : 


          Effects (from Table 8)       Difference
Diet      Male         Female          M. - F.
A         + 3.090        0             + 3.090
B         +29.757      + 8.933         +20.824
C         +29.100      + 7.590         +21.510
D         +37.800      +10.353         +27.447


The table gives the impression that diet A may be the exception. For A the responses of 
the two sexes are not greatly different, while for B, C and D there is a big difference 
between the two sexes, but this difference may possibly not vary significantly among the 
three diets. On the other hand, the data also suggest the alternative hypothesis that the 
degree of difference between the sexes rises with the response, i.e. that the better the diet, 
the more the males will outstrip the females. It is very necessary to remember that the 
same data may be consistent with several hypotheses which, biologically considered, are 
fundamentally different. It is legitimate, and indeed desirable, to employ a priori biological 
knowledge at this stage even though the knowledge may amount to little more than a hint 
that one hypothesis may be more ‘reasonable’ than another. Often, however, no a priori 



Table 7. Calculation of adjustments for two factors, sex x diet and litter (first two cycles)

(i) Composition of the subtotals (under each litter the first figure is the number of males, the second the number of females):

         Sex x diet combinations           Litters
Diet     Male            Female            I       II      III     IV
A        6 = 2+2+1+1     3 = 1+0+1+1       2  1    2  0    1  1    1  1
B        6 = 2+2+1+1     3 = 1+1+1+0       2  1    2  1    1  1    1  0
C        5 = 1+1+2+1     4 = 1+2+0+1       1  1    1  2    2  0    1  1
D        6 = 2+1+2+1     3 = 0+2+0+1       2  0    1  2    2  0    1  1
Total                                      10      11      8       7

(ii) Original marginal subtotals. Calculate adjustments for sex x diet:

         Subtotals         Means              Adjustments
Diet     Male    Female    Male     Female    Male    Female
A        381     191       63.50    63.67       0       0
B        541     207       90.17    69.00     -27      -5
C        462     271       92.40    67.75     -29      -4
D        598     217       99.67    72.33     -36      -9

Litter        I       II      III     IV      Total
Subtotals     734     819     713     602     2868

(iii) Adjust all subtotals and calculate adjustments for litter:

         Subtotals
Diet     Male    Female
A        381     191
B        379     192
C        317     255
D        382     190

Litter         I        II       III      IV       Total
Subtotals      570      669      551      497      2287
Means          57.00    60.82    68.88    71.00
Adjustments    +14      +10      +2       0

(iv) Adjust all subtotals and calculate adjustments for sex x diet:

         Subtotals         Means              Adjustments
Diet     Male    Female    Male     Female    Male     Female
A        431     207       71.83    69.00     -2.83     0
B        429     218       71.50    72.67     -2.50    -3.67
C        345     289       69.00    72.25      0       -3.25
D        424     210       70.67    70.00     -1.67    -1.00

Litter        I       II      III     IV      Total
Subtotals     710     779     567     497     2553

(v) Adjust all subtotals and calculate adjustments for litter:

         Subtotals
Diet     Male      Female
A        414.02    207.00
B        414.00    206.99
C        345.00    276.00
D        413.98    207.00

Litter         I         II        III       IV        Total
Subtotals      689.08    754.50    554.66    485.75    2483.99
Means          68.91     68.59     69.33     69.39
Adjustments    +0.48     +0.80     +0.06     0


Table 7 (continued): Final adjustments

(vi) Adjust all subtotals and calculate final adjustments for both factors:

Sex x diet combinations

         Subtotals          Means               Adjustments
Diet     Male      Female   Male      Female    Male      Female
A        416.64    207.54   69.440    69.180    -0.260     0
B        416.62    208.33   69.437    69.443    -0.257    -0.263
C        346.40    278.08   69.280    69.520    -0.100    -0.340
D        415.86    208.60   69.310    69.533    -0.130    -0.353
Total    2498.07

Litters

               I         II        III       IV        Total
Subtotals      693.88    763.30    555.14    485.75    2498.07
Means          69.388    69.391    69.392    69.393
Adjustments    +0.005    +0.002    +0.001    0

(vii) Adjust grand total and calculate general effect:

Grand total (from vi)        = 2498.07
Adjustments for sex x diet   =   -7.590
Adjustments for litter       =   +0.080
Final grand total            = 2490.560
General effect (i.e. mean)   =   69.18222


Table 8. Collection of adjustments and calculation of sum of squares

Factor          First        Second        Final      Total         Effect      Original
                cycle        cycle         (vi)       adjustment                subtotals
Sex x diet
  M., A         (ii)   0     (iv) -2.83    -0.260      -3.090       + 3.090       381
  M., B              -27          -2.50    -0.257     -29.757       +29.757       541
  M., C              -29           0       -0.100     -29.100       +29.100       462
  M., D              -36          -1.67    -0.130     -37.800       +37.800       598
  F., A                0           0        0           0             0           191
  F., B               -5          -3.67    -0.263      -8.933       + 8.933       207
  F., C               -4          -3.25    -0.340      -7.590       + 7.590       271
  F., D               -9          -1.00    -0.353     -10.353       +10.353       217
Litter
  I             (iii) +14    (v)  +0.48    +0.005     +14.485       -14.485       734
  II                  +10         +0.80    +0.002     +10.802       -10.802       819
  III                 + 2         +0.06    +0.001     + 2.061       - 2.061       713
  IV                    0          0        0           0             0           602

General effect and grand total (vii)                                +69.18222    2868

Sum of products of effects and corresponding subtotals = 236,943
                                                  X²/n = 228,484
                  Sum of squares for effects of factors =   8,459
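
The last three lines of Table 8 can be reproduced directly from its final two columns. The following Python sketch is given only as a check; the list layout and names are our own.

```python
# (label, effect, original subtotal) from the last two columns of Table 8.
rows = [
    ("M., A", 3.090, 381), ("M., B", 29.757, 541),
    ("M., C", 29.100, 462), ("M., D", 37.800, 598),
    ("F., A", 0.0, 191), ("F., B", 8.933, 207),
    ("F., C", 7.590, 271), ("F., D", 10.353, 217),
    ("Litter I", -14.485, 734), ("Litter II", -10.802, 819),
    ("Litter III", -2.061, 713), ("Litter IV", 0.0, 602),
    ("General effect x grand total", 69.18222, 2868),
]
grand_total, n = 2868, 36

sum_of_products = sum(effect * subtotal for _, effect, subtotal in rows)
correction = grand_total ** 2 / n                 # the usual term X^2/n
sum_of_squares = sum_of_products - correction

# Two sets of effects (8 and 4), each fixed only up to an arbitrary constant.
degrees_of_freedom = 8 + 4 - 2

print(round(sum_of_products), round(correction), round(sum_of_squares),
      degrees_of_freedom)
```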





Table 9. Fourth analysis of variance

Source of variation                 Degrees of   Sum of    Mean     Variance   Significance
                                    freedom      squares   square   ratio
Three independent factors                7        7871     1124     22.6         ***
Sex x diet interaction                   3         588     196.0    3.936        *
Three factors and all
  sex x diet interactions               10        8459
Other interactions                      16         813     50.8                  n.s.
Between all combinations                26        9272
Residual                                 9         448     49.8
Total                                   35        9720





knowledge is relevant, and in such cases the greatest care should be taken to avoid adopting 
an hypothesis merely because it is not belied by the data. 

It is, of course, legitimate to test whether the data do or do not contradict any hypothesis, 
always remembering that failure to contradict the hypothesis is not a proof that the 
hypothesis is true. Hence, although it is not necessarily true that the most appropriate 
hypothesis to test is that diet A is exceptional, there can be no objection to making such 
a test as an illustration of the statistical technique. 

The hypothesis to be examined is equivalent to supposing that one particular degree of 
freedom out of the three degrees of freedom for sex x diet interaction has a real effect while 
the other two have no effect. Hence we should separate the mean squares for the one and 
the two degrees of freedom respectively and endeavour to show that the former is and the 
latter is not significant. 

Any algebraic formulation of the process would be cumbersome, but the arithmetic 
proceeds quite straightforwardly once the pattern of adjustments is appreciated (see 
Table 10). We are now permitting a different sex difference in diet A on the one hand and in 
diets B, C and D on the other. Hence the adjustment for sex is made separately in the two 
groups of diets (A) and (B, C or D). To avoid bandying adjustments backwards and for- 
wards between sexes and diets, it is advisable to adjust only the male results. The means of 
the two sexes on diet A only are 

Male      381/6 = 63.50
Female    191/3 = 63.67

These are sufficiently close to make no adjustment necessary on the first cycle. The sex
means calculated from the pooled data for diets B, C and D are

Male        1601/17 = 94.18
Female       695/10 = 69.50
Difference          = 24.68

Hence adjustment is made by subtracting 25 from every male on diets B, C or D, leaving 
unaltered both males and females on diet A and females on diets B, C and D. 





The adjustments to the subtotals are made in (iii) of Table 10. Thus for diets

A    572 - 6(0) = 572    etc.

and for litters

I    734 - 5(25) = 609    etc.

The final adjustments shown in (viii) of Table 10 are appreciable, so the third decimal is of
doubtful value.
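
The first-cycle adjustment of 25, applied only to males on diets B, C and D, may be followed through all three sets of subtotals with the short Python sketch below. It is an illustration under our own naming; the counts are those of (i) of Table 7 and the subtotals those quoted in the text and in Table 11. Agreement of the three totals gives the check on the adjusted subtotals mentioned earlier.

```python
# Pooled sex means on diets B, C and D, and the rounded adjustment for males.
adjustment = -round(1601 / 17 - 695 / 10)

# Males on diets B, C and D in litters I, II, III, IV (from Table 7 (i)).
adjusted_males_per_litter = [2 + 1 + 2, 2 + 1 + 1, 1 + 2 + 2, 1 + 1 + 1]
# Males receiving the adjustment on each diet; males on diet A are left unaltered.
adjusted_males_per_diet = {"A": 0, "B": 6, "C": 5, "D": 6}

litter_subtotals = [734, 819, 713, 602]
diet_subtotals = {"A": 572, "B": 748, "C": 733, "D": 815}
sex_subtotals = {"M in A": 381, "F in A": 191, "M in BCD": 1601, "F in BCD": 695}

new_litters = [t + n * adjustment
               for t, n in zip(litter_subtotals, adjusted_males_per_litter)]
new_diets = {d: t + adjusted_males_per_diet[d] * adjustment
             for d, t in diet_subtotals.items()}
new_sexes = dict(sex_subtotals)
new_sexes["M in BCD"] += 17 * adjustment

# The totals of the three sets must agree.
totals = (sum(new_litters), sum(new_diets.values()), sum(new_sexes.values()))
print(adjustment, new_litters, new_diets, totals)
```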

The number of degrees of freedom in the sum of squares for effects is now 8, being made
up of 1 for sex in diet A, 1 for sex in diets B, C and D, 3 for differences between diets and
3 for differences between litters:

1 + 1 + 3 + 3 = 8.

The sum of squares with 7 degrees of freedom obtained with three independent factors 
(see Table 6) may be subtracted from the sum of squares with 8 degrees of freedom, to leave 
a square of 1 degree of freedom which measures the effect of the specific interaction which
we have selected for examination. The balance to bring the sum of squares with 8 degrees of 
freedom up to the sum of squares with 10 degrees of freedom found when all interactions 
between sex and diet were permitted (see Table 9) is a sum of squares with 2 degrees of 
freedom and measures the effect of the interaction between sex and diet other than the 
specific degree of freedom which we selected. The full analysis of variance is set out in 
Table 12. 

It is clear that the hypothesis examined is acceptable on these data. The selected degree 
of freedom in interaction is fairly significant, whereas the mean square for the remaining 
2 degrees of freedom is neither significant nor even much greater than the residual mean
square. In judging the significance of the selected degree of freedom one should, of course,
allow for the fact that it was chosen as being the biggest of three. However, the three 
together were shown to be on the border-line of significance, so there is little danger that we 
have misled ourselves by selecting a square which is large only by chance. 

6. Presentation of conclusions 

Since litter is independent of the other two factors, any comparison between sexes, between 
diets or between any combinations of sex and diet may be regarded as generally true of all
the litters in this experiment. 

There is some evidence that sex and diet are not independent factors, i.e. that the two 
sexes react differentially to the diets. This evidence only just passes the conventional 5% 
level of significance, so the possibility must not be ruled out that an interaction of the 
observed magnitude has arisen fortuitously. It is impossible, on statistical evidence alone, 
to specify at all closely the nature of this interaction, for it is consistent with either of the 
following hypotheses or indeed with many others : 

(a) That the difference between male and female is the same on diets B, C and D but 
different on A. 

(b) That the sex difference increases steadily with the response of either sex, i.e. that the 
‘better’ the diet, the more the male will outstrip the female. 

If the decision to test either of these hypotheses had been made before examining the 
data, the significance of the particular degree of freedom examined would have been 
considered reasonably strong. Of course when, as in the present case, the hypothesis is 
suggested by the data themselves, one must be prepared to discount to some extent the 
impression given by the moderately high level of significance. 



Table 10. Calculation of adjustments for three factors with one degree of freedom in interaction 







Table 10 (continued)

(v) Adjust all subtotals and calculate adjustments for sex x diet:

Sex within diet group                               Diet                 Litter
                      Subtotal   Mean    Adj.             Subtotal             Subtotal
M., diet A              491      81.83   -2.83      A       728          I       812
F., diet A              237      79.00    0         B       737          II      888
M., diets B, C, D      1368      80.47   +1.53      C       724          III     650
F., diets B, C, D       820      82.00    0         D       727          IV      566
Total                  2916                                2916                 2916

(vi) Adjust subtotals for sex x diet and calculate adjustments for diet:

Sex within diet group              Diet                                  Litter
                      Subtotal           Subtotal   Mean    Adj.               Subtotal
M., diet A             474.02      A     711.02     79.00   +3.91        I     813.99
F., diet A             237.00      B     746.18     82.91    0           II    888.46
M., diets B, C, D     1394.01      C     731.65     81.29   +1.62        III   654.82
F., diets B, C, D      820.00      D     736.18     81.80   +1.11        IV    567.76
Total                 2925.03           2925.03                               2925.03

(vii) Adjust all subtotals and calculate adjustments for litter:

Sex within diet group              Diet                   Litter
                      Subtotal           Subtotal               Subtotal   Mean    Adj.
M., diet A             497.48      A     746.21           I     831.18     83.12   +0.39
F., diet A             248.73      B     746.18           II    904.47     82.22   +1.29
M., diets B, C, D     1408.77      C     746.23           III   668.10     83.51    0
F., diets B, C, D      829.81      D     746.17           IV    581.04     83.01   +0.50
Total                 2984.79           2984.79                2984.79

(viii) Adjust all subtotals and calculate final adjustments for all factors:

Sex within diet group
                      Subtotal   Mean     Adj.
M., diet A             501.34    83.557   -0.350
F., diet A             249.62    83.207    0
M., diets B, C, D     1417.38    83.375   +0.429
F., diets B, C, D      838.04    83.804    0
Total                 3006.38

Diet     Subtotal   Mean     Adj.
A        750.96     83.440   +0.102
B        751.72     83.524   +0.018
C        751.88     83.542    0
D        751.82     83.536   +0.006
Total   3006.38

Litter   Subtotal   Mean     Adj.
I        835.08     83.508   +0.007
II       918.66     83.515    0
III      668.10     83.512   +0.003
IV       584.54     83.506   +0.009
Total   3006.38

(ix) Adjust grand total and calculate general effect:

Grand total (from viii)       3006.38
Adjustment for sex x diet       +5.193
Adjustment for diet             +1.134
Adjustment for litter           +0.157
Final grand total             3012.864
General effect (i.e. mean)      83.69067






Table 11. Collection of adjustments and calculation of sum of squares

Factor                     First        Second         Final      Total         Effect      Subtotals
                           cycle        cycle          (viii)     adjustment                (i)
Sex in diet A
  M.                       (ii)   0     (v)   -2.83    -0.350      -3.180       + 3.180       381
  F.                              0            0        0           0             0           191
Sex in diets B, C and D
  M.                            -25           +1.53    +0.429     -23.041       +23.041      1601
  F.                              0            0        0           0             0           695
Diet A                     (iii) +10    (vi)  +3.91    +0.102     +14.012       -14.012       572
     B                           + 7           0       +0.018     + 7.018       - 7.018       748
     C                           + 6          +1.62     0         + 7.620       - 7.620       733
     D                             0          +1.11    +0.006     + 1.116       - 1.116       815
Litter I                   (iv)  +14    (vii) +0.39    +0.007     +14.397       -14.397       734
       II                        +10          +1.29     0         +11.290       -11.290       819
       III                       + 2           0       +0.003     + 2.003       - 2.003       713
       IV                          0          +0.50    +0.009     + 0.509       - 0.509       602

General effect (ix)                                                             +83.69067
Grand total (original)                                                                       2868

Sum of products of effects and corresponding subtotals = 236,817
                                                  X²/n = 228,484
                  Sum of squares for effects of factors =   8,333



Table 12. Fifth and final analysis of variance

Reference   Source of variation                     Degrees of   Sum of    Mean     Variance
                                                    freedom      squares   square   ratio
T. 6        Three independent factors                    7        7871     1124     22.6***
Dif.        Selected sex x diet interaction              1         462      462      9.3*
T. 11       3 factors and a selected interaction         8        8333
Dif.        Remaining sex x diet interactions            2         126       63      n.s.
T. 9        3 factors and sex x diet interactions       10        8459
T. 9        Other interactions                          16         813     50.8      n.s.
T. 9        Between all factorial combinations          26        9272
T. 9        Residual                                     9         448     49.8
            Total                                       35        9720






If we accept the evidence for interaction between sex and diet, the results of the experiment
may be summarized by the estimated effects shown in Table 8. The magnitude of the
differences between litters is, however, of little interest. Each combination of sex and diet
may therefore be averaged over the four litters. We have

69.182 - ¼(14.485 + 10.802 + 2.061 + 0) = 62.345.

To this we may add the effect for each diet x sex combination; thus for males on diet A
we have 62.345 + 3.090 = 65.435.

It is unnecessary to quote the decimal figures; the calculation was carried to such 
accuracy only to illustrate the arithmetical processes. The conclusions can therefore be 
summarized in the following table : 

Expected gains in weight (g.)

Diet     Male     Female     M. - F.
A         65        62          3
B         92        71         21
C         91        70         21
D        100        73         27


This table, read in conjunction with the notes on the significance of the sex x diet inter- 
action, summarizes all the information in the experiment. 
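
The entries of this summary table follow from the effects of Table 8 by the averaging just described. The Python sketch below is only a check on that arithmetic, with names of our own choosing.

```python
# Effects from Table 8.
general_effect = 69.182
litter_effects = [-14.485, -10.802, -2.061, 0.0]
combo_effects = {
    ("M", "A"): 3.090, ("F", "A"): 0.0,
    ("M", "B"): 29.757, ("F", "B"): 8.933,
    ("M", "C"): 29.100, ("F", "C"): 7.590,
    ("M", "D"): 37.800, ("F", "D"): 10.353,
}

# Average over the four litters, giving each litter equal weight.
litter_average = general_effect + sum(litter_effects) / len(litter_effects)

# Expected gain for each combination, rounded to the nearest gram.
expected_gain = {c: round(litter_average + e) for c, e in combo_effects.items()}

for diet in "ABCD":
    male, female = expected_gain[("M", diet)], expected_gain[("F", diet)]
    print(diet, male, female, male - female)
```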

7. Discussion of methods 

We have shown that even when orthogonality restrictions have completely broken down in 
a factorial experiment, it is still possible to construct exact analyses of variance and to 
make valid tests of the significance of any effects or interactions or in fact of any form of 
departure from any hypothesis. The reader may, however, have been left with an impression
that the whole thing is a tour de force rather than a recognizable method of analysis.
We regret if such is the impression ; the calculations are long and tedious, but we hope that 
they follow a pattern of which the logic can be appreciated. 

The principle of construction of a valid analysis of variance, in so far as it can be summed
up in a few sentences, is as follows: Take any two hypotheses X and Y, of which X is 
admitted and Y is more complicated. Calculate the effects which are specified by X and the 
sum of squares for these effects. Do the same for hypothesis Y. Then the increases in the 
number of degrees of freedom and in the sum of squares as we pass from X to Y will lead to 
a mean square for deciding whether we must adopt the more complicated hypothesis. Thus 
in constructing the analysis of variance in Table 9 the relevant hypotheses were : 

                                                              Degrees of freedom

X    Sex, diet and litter have significant effects but
     none of their interactions is significant                        7

     Difference                                                       3

Y    Same as X, except that possibility of interaction
     between sex and diet is admitted                                10
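
The principle may be expressed as a small routine which passes along a chain of successively more complicated hypotheses and forms the increments. The Python sketch below is an illustration only; the function name and data layout are hypothetical, and the figures are those of Tables 6, 9, 11 and 12.

```python
def nested_analysis(hypotheses, residual_ss, residual_df):
    """Increments in degrees of freedom and sum of squares along a chain of
    hypotheses ordered from the simplest to the most complicated."""
    residual_ms = residual_ss / residual_df
    lines = []
    previous_df, previous_ss = 0, 0
    for label, df, ss in hypotheses:
        extra_df, extra_ss = df - previous_df, ss - previous_ss
        mean_square = extra_ss / extra_df
        lines.append((label, extra_df, extra_ss, mean_square,
                      mean_square / residual_ms))
        previous_df, previous_ss = df, ss
    return lines


# (label, degrees of freedom, sum of squares for effects) for each hypothesis.
chain = [
    ("three independent factors", 7, 7871),
    ("selected sex x diet interaction admitted", 8, 8333),
    ("all sex x diet interactions admitted", 10, 8459),
    ("all interactions admitted", 26, 9272),
]
table = nested_analysis(chain, residual_ss=448, residual_df=9)

for label, df, ss, ms, ratio in table:
    print(f"{label:45s} {df:3d} {ss:6d} {ms:8.1f} {ratio:6.2f}")
```

Each line gives the mean square for deciding whether the next, more complicated, hypothesis must be adopted in place of the one before it.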

The difficulty which is peculiar to a non-orthogonal experiment is that the effects of 
irrelevant factors, such as litter in the example quoted, have to be computed separately on 





the two hypotheses, X and Y. In an orthogonal experiment the effects of litters (and hence 
the corresponding sum of squares) would be unaltered by any modification of the hypo- 
thesis concerning the other two factors. This, when translated into general terms, explains 
why the statistical analysis of a non-orthogonal (and unbalanced) experiment will neces- 
sarily be troublesome; in fact, as we have already emphasized, the moral to be pointed is 
that the research worker should either acquaint himself with the principles of experimental 
design or else employ a statistician before embarking on the experiment. 


Summary 

Data are presented from a biological experiment in which no attempt was made to ensure 
orthogonality or even balanced non-orthogonality. The proper design for an experiment 
with the same material is briefly discussed. The methods for the statistical analysis of the 
data are then explained by means of the arithmetical computations laid out in such a form 
that the routine can be understood and hence applied to other examples. 




COMPARISONS OF HEIGHTS AND WEIGHTS OF GERMAN 
CIVILIANS RECORDED IN 1946-7 AND ROYAL AIR FORCE 
AND OTHER BRITISH SERIES 

By G. M. MORANT, D.Sc., R.A.F. Institute of Aviation Medicine
1. Summary and main conclusions 

1.1. Records of heights and weights collected during the war show clearly that changes
were taking place in the distributions of the measurements for the British population. 
Compared with pre-war years maximum height, indicating skeletal maturity, was normally 
reached at a younger age than previously. Weights for the war period tended to be above 
pre-war levels for ages under about 25 years and below them for ages over 25. New standards 
were needed which could be used to assess the significance of contemporary and future 
changes in heights and weights. This paper is a contribution to that topic, which is aided by 
comparison of the recent British records with those for large numbers of civilians in the 
British zone of Germany measured for the Public Health Branch of the Control Commission 
for Germany (British Element) from December 1946 to June 1947. 

1.2. Comparisons made lead to the following general conclusions:

(a) Appreciable differences are found between the mean heights, recorded weights and
weights standardized for height of the German children of different towns (Hamburg and
Berlin) and regions (Länder). For adults the only relation between the mean weights
standardized for height which is consistent for both sexes and nearly all age groups is that
the Berlin series are inferior to all others.

(b) Pooled series representing all regions except Berlin are treated in later comparisons.
The data were subdivided to represent earlier (December 1946-February 1947) and later
(March-June 1947) periods. None of the secular differences between the mean heights and 
weights of the German children, and between the weights of the adults, are large and it is 
possible that they were due partly to normal seasonal fluctuations which may be exhibited 
by a population having constant nutritional conditions. 

(c) Occupational series of adults show a sequence in mean weights from very heavy 
workers (heaviest) to normal consumers (lightest). 

(d) The total series, excluding Berlin children and adolescents, give growth curves for 
height which are of the usual forms, but which have maxima at the remarkably young 
ages of 19½ years for males and 17 years for females. For both sexes the weight curves are
peculiar in showing less increase for ages over 18 than is normally found.

(e) For ages under 19 and comparisons of weights reduced at each age to a constant
height, the post-war German standards occupy low positions, but ones which are not
extremely low compared with those for pre-war German series.

(f) The maximum of the male age curve for height given by the post-war survey is
greater than any of the pre-war means for regional sections of the male adult population of 
the British zone of Germany. These pre-war standards were probably underestimates of the 
mean heights of German men. 

(g) The following conclusions ((g)-(j)) are derived from comparisons of mean weights for 
different series reduced to the same height at each age, i.e. for weights compared after 





allowance has been made for differences in height. For adolescent male British industrial 
workers, series measured during the war (1943) were decidedly heavier than series measured 
before the war (1929-32). The British series of both dates are well above the post-war 
German series. For British female industrial workers, series measured during the war show 
a higher standard than pre-war series (1926), the latter being rather lower than the post-war 
German series. 

(h) For men aged 17-40 years the mean standardized weights for twelve British and the
post-war German series give the sequence: Commonwealth and British serving aircrew 
(heaviest) — aircrew recruits — R.A.F. ground staff — industrial workers, 1943 — industrial 
workers, 1929-32 — Germans — conscripts, including those rejected after examination, 
1917-18 — unemployed industrial workers, 1929-32 (lightest). 

(i) The situation is markedly different for ages over 40. The German series clearly falls 
to the lowest place, and unlike all other series it shows mean weights rapidly declining with 
increasing age: the British wartime industrial workers fall below the pre-war employed 
workers, though remaining above the low levels of the 1917-18 conscripts and pre-war 
unemployed workers. Rationing has affected most markedly the weights of the middle 
aged and aged. 

(j) The few series of women lead to similar conclusions. British industrial workers were 
decidedly heavier during than before the war, and the Germans were clearly at a lower level 
for all except younger ages (up to about 30 years). The evidence suggests that for all ages 
loss of weight due to restricted rations has been greater for females than for males, and 
decidedly greater in the case of both sexes for middle-aged and old people than for young 
adults and children. 


2. The German 1946-7 survey 

2.1. Starting in December 1946, the Public Health Branch of the Control Commission for
Germany (British Element) has carried out a new survey of the body weights of German 
people in all parts of the British zone, including the British sector of Berlin. The plan was to 
remeasure the same people, as far as possible, at monthly intervals, the weights being taken 
during the first two weeks of each month. They were recorded under the direction of Public 
Health Officers, who aimed, in accordance with guidance received, at obtaining random 
samples of the populations of their districts. Both sexes and all age groups were to be repre- 
sented. The returns were reduced by No. 1 Nutrition Survey Team and Hollerith machines 
were used for the purpose. The writer is indebted to Brigadier W. S. Martin, M.C., Public 
Health Adviser, Health Branch, C.C.G., and to F. D. G. Bailey, Esq., formerly Nutrition 
Officer of the same Branch, for permission to use the German data given in a new form in the 
present paper. 

2.2. The records available consist of a preliminary report on the survey (Control Commission
for Germany, 1947) with tabulated data for the months December 1946-March 1947,
and additional tables for the months April-June 1947. Only average values of the measure- 
ments — heights and weights of children and adolescents and weights standardized for 
height of adults — are given. The total material can be divided into two sets, viz. 

(i) Children and adolescents : (a) means of heights and weights as recorded for each month 
December 1946-February 1947 (and for some regions also for March 1947), distinguishing 
sexes, ages (denoted by year of birth), and Berlin, Hamburg and regional samples, the data 




also being given for the total British zone except Berlin; (b) the same data for each month 
March-June 1947 for the total British zone except Berlin only. 

(ii) Adults: (a) mean weights reduced to constant heights (see § 3.6) for each month
December 1946-February 1947 (and for some regions also for March 1947), distinguishing 
sexes, ages (by decennial groups of years of birth), occupational groups, and Berlin, 
Hamburg and regional samples, the data also being given for the total British zone except 
Berlin; (b) the same data for each month March-June 1947 for the total British zone except 
Berlin only. 

The age ranges for the children and adolescent series, on the one hand, and for the 
‘adult’, on the other, overlap. The youngest group for the latter is 7-17 years and data for 
this were not used as it covers too large a part of the age cycle when growth is most rapid. 
Other details regarding the survey are discussed in § 3 below. 

3. Treatment of the German 1946-7 survey data 

3.1. The available statistical data for the German survey consist of means only. Possible
ways of treating them are thus very limited and rigorous comparisons cannot be made. The 
aim of the comparisons in following sections of this paper is to reveal the salient character- 
istics of the material only. The treatment consists chiefly in deriving clear-cut conclusions 
from graphical comparisons. The means are for series distinguished on account of sex, age, 
locality, time (month of measurement) and occupation (adults only). By appropriate 
grouping of the means, providing pooled means, the situation is examined for each of the 
last three of these factors considered singly, possible interaction between them being 
ignored. This omission is not likely to have given invalid conclusions, since notice is only 
taken of marked distinctions. 

3.2. The aim was to remeasure the same people at monthly intervals, so that information
would be obtained most economically regarding a secular change in the distributions of 
weights for the populations sampled. For various reasons this was not achieved fully and it 
is said: ‘The number of subjects who were actually weighed in consecutive months is much 
lower than might be expected.’ The number of individuals giving a mean for any particular 
month is known. In pooling data for different months the number of observations is known 
but not the number of individuals involved, because the number of repeated weighings is 
unknown. It may be noted that the n's for means given in the tables below have greater
statistical value in secular comparisons and lesser value in regional and occupational 
comparisons. 
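
The pooling of means for different months, regions or occupations described in § 3.1 amounts to a weighted mean, each constituent mean being weighted by its number of observations. A minimal sketch in Python follows; the function name and the figures in the example are hypothetical and are not taken from the survey.

```python
def pooled_mean(means, counts):
    """Pool several group means, weighting each by its number of observations."""
    if len(means) != len(counts) or not means:
        raise ValueError("means and counts must be non-empty and of equal length")
    total = sum(counts)
    if total <= 0:
        raise ValueError("the total number of observations must be positive")
    return sum(m * n for m, n in zip(means, counts)) / total


# Hypothetical monthly mean weights (kg.) and numbers of observations.
monthly_means = [30.0, 30.4, 30.2]
monthly_counts = [1000, 400, 1100]
pooled = pooled_mean(monthly_means, monthly_counts)
```

The counts used as weights are numbers of observations, not of individuals, so repeated weighings of the same person enter the pooled mean more than once.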

3.3. The original grouping is by calendar years of birth, not years of age. When translated
into years of age, with regard to the date or dates of measurement, the central values of
groups are fractions of years. In comparing the data for children with other series it is more
convenient to have means for even central values of the form x.0 or x.5 years. Estimates
at these ages were obtained for the German and other series when necessary by linear
interpolation, which can be supposed sufficiently precise when data at 1-year intervals are being
treated. Consideration of the following points is necessary.
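
The linear interpolation to an even central age can be sketched as follows. The function name is hypothetical; the two means used in the example are the heights of German boys at central ages 10.7 and 11.7 years as read from Table 3.

```python
def interpolate_mean(age_lo, mean_lo, age_hi, mean_hi, target_age):
    """Linear interpolation of a mean between two adjacent central ages."""
    if not age_lo < age_hi:
        raise ValueError("age_lo must be less than age_hi")
    if not age_lo <= target_age <= age_hi:
        raise ValueError("target_age must lie between the two central ages")
    fraction = (target_age - age_lo) / (age_hi - age_lo)
    return mean_lo + fraction * (mean_hi - mean_lo)


# Mean heights (mm.) of German boys at central ages 10.7 and 11.7 years (Table 3),
# interpolated to the even central age 11.0 years.
height_at_11 = interpolate_mean(10.7, 1362.6, 11.7, 1408.2, 11.0)
```

The estimate lies three-tenths of the way from the younger to the older mean, which is about 1376.3 mm.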

3.4. Seasonal fluctuations in height and weight increments. The German data are for the
months December 1946-June 1947, measurements having been recorded during the first 
two weeks of each month. It is convenient to make comparisons between pooled means for 



G. M. Morant


the two periods December 1946-February 1947 and March-June 1947. In doing this, 
differences observed might be partly due to normal seasonal changes in growth rates for 
height and weight in the case of immature individuals, and to normal seasonal fluctuations 
in the weights of adults. The same question has to be taken into account in comparing the 
German series as a whole with others. Data regarding seasonal changes of heights and 
weights in children have been given by Schmid-Monnard (1895), Mumford (1927), Friend 
(1935) and Friend & Bransby (1947), and for adolescents and adults by Kemsley (1945-7). 
It has been found for schoolchildren that growth at all times of the year is greater during 
holidays than during term time. The evidence of different series is not entirely in agreement, 
but it justifies some rather vague general conclusions. These are that seasonal fluctuations 
in the growth rate of height in boys (and presumably girls, too) are probably so small that 
they may be neglected in comparing different series; that they may be appreciable for 
weights of children — giving maximum differences between quarterly averages for the same 
population of the order 2 lb. — and that they are probably negligible for weights of adults. 

3.5. Allowances for clothes. In comparing the heights and weights of different series of
children and adults, allowance has to be made for differences in the clothing worn by the 
subjects when measured. The allowances are required to adjust average values, and there 
can be no assurance that they are precisely correct when applied to a particular series. 
Relevant information is : 

(a) With reference to a survey of male industrial workers: ‘Height was taken without
shoes, but with very thin heel-less slippers. Weights were taken with ordinary clothes on,
8 lb. being deducted from the gross weights registered, this value being the average of
repeated weighings of male clothing’ (Cathcart, Hughes & Chalmers, 1935).

(b) With reference to a survey of female industrial workers: ‘Weight was taken without
shoes, light heel-less slippers being provided.... Assessment was made of the weight of
clothing worn by the average worker, and as a result 4 lb. was deducted from the gross
weights obtained’ (Cathcart, Bedale, Blair, Macleod & Weatherhead, 1927).

(c) The following mean weights in kg. (1 kg. = 2.2 lb.) are given for the clothes of Alsatian
boys measured in the spring (Schlesinger, 1917):


Age             7      8      9      10     11     12     13     14

Coat            0.2    0.2    0.25   0.3    0.4    0.5    0.6    0.6
‘Trousers’      0.25   0.25   0.3    0.35   0.4    0.4    0.5    0.6
Underclothes    0.25   0.25   0.3    0.3    0.3    0.3    0.3    0.3
Boots           0.5    0.6    0.7    0.7    0.8    0.8    0.9    0.9

Total           1.2    1.3    1.55   1.65   1.9    2.0    2.3    2.4


(d) Measurements of R.A.F. aircrew clothed and unclothed (taken in December) give the
following data:

Average excess of height in socks over height nude = 0.27 in. (520).

Average excess of weight clothed, without tunic or boots and with trouser
pockets empty, over weight nude = 4.9 lb. (520).

Average weight of tunics = 2.5 lb.

Average weight of boots = 3.5 lb.




Measurements of the German 1946-7 survey are recorded for subjects in indoor clothing 
and without shoes. For various reasons some of the subjects were actually measured in 
other conditions, and the following corrections were applied to adjust such readings so that 
all records would be for subjects wearing the prescribed clothing (indoor clothing and 
without shoes) : 


Apparel                    Correction for height (cm.)    Correction for weight (kg.)
                           Males        Females           Males        Females

Shoes                      -2           -3                -1.0         -0.5
Overcoat and shoes         -2           -3                -3.5         -2.0
Heavy working dress        -2           ..                -5.0         ..
Religious sisters dress    ..           -2                ..           -3.0
Underwear only             ..           ..                +0.7         +0.7
None                       ..           ..                +1.0         +1.0


It is not stated that these estimates were obtained (as average values) by actually 
measuring subjects wearing different sets of apparel, or by weighing garments. Most of 
them are in fairly close agreement with earlier estimates, but the additions to be made to 
weights when either underwear only or no clothing was worn are certainly too small. 
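
The application of the survey's weight corrections can be sketched as a simple lookup. The values below are the corrections as read from the table above (the placing of decimal points in some entries is a reading of a damaged print); the function name, the dictionary layout and the example reading are hypothetical.

```python
# Weight corrections (kg.) by apparel and sex, as read from the table above.
# None means that no correction is given for that combination.
WEIGHT_CORRECTION_KG = {
    "shoes": {"male": -1.0, "female": -0.5},
    "overcoat and shoes": {"male": -3.5, "female": -2.0},
    "heavy working dress": {"male": -5.0, "female": None},
    "religious sisters dress": {"male": None, "female": -3.0},
    "underwear only": {"male": 0.7, "female": 0.7},
    "none": {"male": 1.0, "female": 1.0},
}


def weight_in_prescribed_clothing(recorded_kg, apparel, sex):
    """Adjust a recorded weight to the prescribed condition (indoor clothing, no shoes)."""
    correction = WEIGHT_CORRECTION_KG[apparel][sex]
    if correction is None:
        raise ValueError(f"no correction is given for {apparel!r} and {sex!r}")
    return recorded_kg + correction


# Hypothetical reading: a man weighed at 70.0 kg. in overcoat and shoes.
adjusted = weight_in_prescribed_clothing(70.0, "overcoat and shoes", "male")
```

Negative corrections remove the weight of extra apparel; positive ones add the assumed weight of indoor clothing to subjects weighed in underwear or unclothed.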

In comparing different series mean heights and weights were adjusted when necessary by 
referring to the data given above (a)-(d) and choosing arbitrary corrections for clothing 
suitable to the circumstances. For children and adolescents the reduction is to measure- 
ments in indoor clothing and without shoes, so the German 1946-7 survey means are 
accepted as given. For weights of adults the reduction is to nude measurements, since most 
of the series give records for this condition, and the German means are adjusted accordingly. 
There is no assurance that allowances made for clothes are very close approximations to the 
unknown values in all cases, but errors on this account are not likely to be greater than 
about ¼ in. (6 mm.) for height and 1 lb. (0.45 kg.) for weight.

3.6. Reduction of weights to constant heights. In comparing the mean weights of two series
it is necessary to make proper allowance for differences between their mean heights at 
different ages. This is best done for each age by using the regression coefficient of weight on 
height (giving increase in mean weight for unit increase in mean height), and reducing the 
mean weights of one of the series to values expected if its mean heights were those of the 
other series. Two sets of weights, one set estimated, for people having the same average 
heights are thus compared. Regression coefficients provided by the following British series 
were examined, all samples being for adequate numbers of individuals : 

(a) male industrial workers, 1929-32, age groups 16 to over 60 years (Cathcart et al. 1935);

(b) female industrial workers, 1926, age groups 15 to over 50 years (Cathcart et al. 1927);

(c) Glasgow children divided into four groups representing different social grades, age 
groups 6-13 years (Elderton, 1914); 

(d) East Sussex children, age groups 5-14 years (Dunstan, 1925) ; 

(e) R.A.F. aircrew, 1944, age groups 20-40 years (unpublished). 

The regression coefficients for these series are in good agreement, and there are clear
differences between the values on account of sex, age and social class. In general they
increase with increase in mean weight. The age curves provided by the coefficients for each
sex, and the relations between the curves for the two sexes, are very similar to typical age
curves for weight (e.g. Fig. 8), female values only exceeding male for some adolescent ages.
Series representing higher social classes tend to have greater coefficients for corresponding
ages than series representing lower social classes, but differences of this kind are relatively
small.

In practice it is not essential that regression coefficients used for the purpose considered 
should be derived from either of a pair of series compared, or that values used should be 
very close approximations to such ideal data. This is so because differences between the 
mean heights of the series at corresponding ages — the factor by which the regression 


Table 1. Data used in reducing weights to constant heights

Age group (years): central values       6.5     7.5     8.5     9.5    10.5    11.5    12.5

Accepted regression coefficients, x (1 cm. = x kg.)
  Males                                 0.27    0.29    0.31    0.33    0.36    0.40    0.44
  Females                               0.25    0.27    0.29    0.31    0.35    0.40    0.46

Heights (cm.) to which weights were reduced
  Males                                 118     122     127     131     136     140     144
  Females                               117     121     126     130     135     141     147


Age group (years): central values      13.5    14.5    15.5    16.5    17.5    18.5    All adult ages

Accepted regression coefficients, x (1 cm. = x kg.)
  Males                                 0.49    0.55    0.60    0.63    0.65    0.65    0.65
  Females                               0.51    0.53    0.54    0.55    0.55    0.55    0.55

Heights (cm.) to which weights were reduced
  Males                                 149     155     162     168     171     173
  Females                               152     157     161     161     161     161



coefficients have to be multiplied — will nearly always be small. The reduced weights given 
in this paper were obtained by using a constant set of regression coefficients, these having 
been made up by balancing the evidence of the series listed above. 
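
The argument of the preceding paragraph can be illustrated numerically. In the sketch below the reduced weight is taken as the mean weight plus the regression coefficient multiplied by the difference between the reference height and the series' own mean height. The function name and the figures in the example (a series 1 cm. shorter than the reference) are hypothetical; only the two coefficients 0.55 and 0.65 are values quoted in this paper.

```python
def reduce_weight(mean_weight_kg, mean_height_cm, reference_height_cm, coeff_kg_per_cm):
    """Mean weight expected if the series had the reference mean height."""
    return mean_weight_kg + coeff_kg_per_cm * (reference_height_cm - mean_height_cm)


# Hypothetical series: mean weight 60.0 kg., mean height 167.5 cm.,
# reduced to a reference height of 168.5 cm. with two different coefficients.
with_coeff_055 = reduce_weight(60.0, 167.5, 168.5, 0.55)
with_coeff_065 = reduce_weight(60.0, 167.5, 168.5, 0.65)

# The change in the reduced weight equals the change in the coefficient
# multiplied by the height difference, here 0.10 x 1 cm.
effect_of_coefficient = with_coeff_065 - with_coeff_055
```

With a height difference of 1 cm., an error of 0.10 in the coefficient alters the reduced weight by only 0.10 kg., which is small beside the uncertainty of the allowances for clothes.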

In comparisons with the German 1946-7 survey, the position is different for the immature
series, on the one hand, and the adult on the other (see § 2.2). In the case of the former,
absolute mean heights and weights are available for each year of age. It is convenient (at
each age) to reduce the mean weights of all other series to the mean heights of the German
series, these mean heights being given, together with the regression coefficients used, in
Table 1. In the case of the adults no absolute constants are available, but only mean
weights reduced to the heights of 1685 mm. for men and 1575 mm. for women. The reduction
is said to have been made for all age groups by applying the allowances (1 cm. in
height) = (0.65 kg. in weight) for men, and (1 cm.) = (0.55 kg.) for women. These are
presumably regression equations derived from correlation tables for height and weight, and
they show good agreement with equations given by British series of adults. Accordingly,
for all adult ages the German equations, and the heights to which the weights were reduced,
were adopted. It may be noted that these heights (1685 mm. for men and 1575 mm. for
women) are probably rather lower than the averages for German adults (cf. Fig. 7), so the
reduced weights provided are probably rather less than the actual averages for the people.
In comparisons where weights were adjusted the data given in Table 1 were used.

In view of the loss in precision — due chiefly to allowances which have to be made for 
clothes, and partly to the method of reduction to constant heights — no significance should 
be attributed to small differences between mean weights for different series. A difference
greater than 2 lb., say, is almost certainly real, but a difference of 1 lb. may not be. 


4. The heights and weights of Germans recorded in 1946-7 

4.1. Regional comparisons. Comparisons can only be made for the months December
1946-March 1947 (children), or December 1946-February 1947 (adults). Pooled means 
were calculated for these periods, restricted to the years of age for which the data are most 
adequate, and for separate regions. The mean heights and weights for boys are plotted in 
Figs. 1 and 2, and the diagrams for girls give a similar impression of regional distinctions. 
Assuming that the order of variation was not exceptional, it can be estimated, by using 
standard deviations for other series, that most of the differences between the German means 
at corresponding ages are statistically insignificant, but some of the differences are almost 
certainly markedly significant. Taking weighted mean differences for ages between 6 and 
16 years suggests the following order for both boys and girls: 

(tallest) North Rhine; Westphalia; Berlin; Hamburg; Hannover; Schleswig-Holstein (shortest).

The weights as recorded give the similar orders:

Boys: (heaviest) North Rhine; Westphalia; Hamburg; Berlin; Hannover; Schleswig-Holstein (lightest).

Girls: (heaviest) North Rhine; Hamburg; Westphalia; Schleswig-Holstein; Hannover; Berlin (lightest).

The orders given by the mean weights are changed substantially when allowances are
made for differences between the mean heights. The reduced weights, obtained by the
method described in § 3.6, give:

Boys: (heaviest) Schleswig-Holstein; North Rhine; Westphalia; Hamburg; Hannover; Berlin (lightest).

Girls: (heaviest) North Rhine; Schleswig-Holstein; Hamburg; Westphalia; Hannover; Berlin (lightest).

For both boys and girls the differences between the extremes here are small (less than 
1 kg.) for the younger ages and decidedly larger (several over 2 kg.) for ages over 12 years. 



Fig. 1. Average heights for regional series of German boys, December 1946-March 1947.

Fig. 2. Average weights (indoor clothing and without shoes) for regional series of German boys, December 1946-March 1947.






Reduced mean weights for the adults are shown in Fig. 3 and they suggest the following
sequences:

Men: Hannover, Hamburg and Schleswig-Holstein (heaviest, with no appreciable differences between levels); Westphalia; North Rhine; Berlin (lightest).

Women: Westphalia; North Rhine, Hamburg and Schleswig-Holstein (no appreciable differences); Hannover; Berlin.



Fig. 3. Average weights (indoor clothing and without shoes) standardized for height of
regional series of German men and women, December 1946-February 1947. 






Excluding the Berlin series, the maximum differences between the reduced mean weights
for the five other regions are of the order 2 kg. The only relation between the weights 
standardized for height which is consistent for both sexes and nearly all ages is that the 
people of Berlin were inferior to all the others. Regional differences between the other 
series are smaller in value and clearly of less significance. 

4.2. Secular comparisons: heights of children. The total series considered here is that for
the British zone of Germany excluding Berlin. The numbers of observations made in each
month are roughly the same for the boys and girls and of the orders 10,000 for December
1946, 4000 for January 1947 and 11,000-13,000 for each of the months February-June
1947. A comparison of the mean heights for separate months failed to reveal any consistent
secular trend. To obtain larger samples the seven months were divided into the two groups
December 1946-February 1947 and March-June 1947, and the pooled means suggest the
following conclusions:

Boys. Ignoring the two youngest ages for which the numbers are small, for ages under 
13 years the two ‘curves’ are interlaced and the maximum divergence between them is 
about 16 mm. For ages over 13 (Fig. 4) the curves are separated and the divergence 
reaches a maximum of 26 mm. (1 in.) between ages 16 and 17 years, which is estimated to 
be markedly significant. 

Girls. For ages under 13 the curves are interlaced and the maximum divergence is about 
10 mm. For ages over 13 (Fig. 4) the means tend to be greater for the later period, but the 
maximum divergence is only 10 mm. and this may not be significant. 

The general conclusion for both boys and girls is that there is no clear distinction between 
the growth rates for height of the population in the two periods for ages under 13; for ages 
over 13 there is indication, which is clear for boys and less clear for girls, that growth in 
height was faster in the later than in the earlier period. The distinction may be due partly 
to a normal seasonal fluctuation in the growth rate (see § 3.4).

4.3. Secular comparisons: weights of children. Comparisons of the age curves for the two
periods lead to conclusions for weight similar to those for height. There are no clear
distinctions for ages under 13. For ages over 13 (Fig. 5) the boys tend to be heavier for the
later period, the maximum divergence of the curves being about 1.4 kg. (3 lb.), but the girls
tend to be practically the same or lighter for the later period. These comparisons relate to
the actual weights recorded. Standardizing the means to the same height at each age, by
the method described in § 3.6, has the effect of lessening the divergence of the weight curves
and reversing the sign of the differences for most ages. The amounts in kg. by which the
mean weights for the earlier period exceed those for the later after allowance has been made
for differences in mean heights are:


Central value of
age group (years)    13.5    14.5    15.5    16.5    17.5

Boys                 -2.4    +0.4    +0.8    +0.3    -0.1
Girls                +0.5    +0.6    +0.5    +0.9    +1.2


Interpreted in this way, the tendency was for children of the same age and height to be 
lighter in the later period, but the differences are small and of less significance because other
evidence suggests that weight increments for children are normally rather greater during the 
winter than during the summer. 

4.4. Secular comparisons: weights of adults. As for the children, comparisons are made
between the two periods December 1946-February 1947 and March-June 1947. The data 
are mean weights standardized to constant heights. For males above age 40 the means are 
appreciably greater for the latter period: for females the same is true for ages over 70, but 
for ages about 30-70 the position is reversed. The distinction here is of more significance in 
view of the fact that weights of adults are believed to be normally rather greater in the 
winter than in the summer. 

4.5. Comparisons between occupational series. The means are plotted in Fig. 6. The
sequence: Very Heavy Workers (heaviest) — Heavy Workers — Miners — Moderately Heavy 
Workers and Normal Consumers (lightest) is clear for the men. For the women an order is 
less clear, but Normal Consumers tend to be appreciably lighter than Workers. The 
distinctions between the series must, of course, be influenced largely by processes of 
selection. For example, Very Heavy Workers (men) were as heavy for older as for younger
age groups, which is not true for the total population (see Fig. 20). The distinction may be 
due to better feeding of the subgroup, or to the fact that middle-aged men below its average 
in physique and weight were liable to leave it because they were unable to continue heavy 
work, or both these factors may have been involved. 

4.6. Age curves for the total German series of children and adolescents. (Table 3, Figs. 7, 8.)
The series referred to are for the total British zone except Berlin. The age curves for height 
are almost coincident for ages under 11 years: for ages 11-16 the girls have the greater 
averages, this being the time when a clear puberty dip is shown by the curve for the boys. 
The curves provide evidence of the attainment of skeletal maturity at surprisingly young 
ages. The maximum average height for males is seen to have been reached by 19.5 years,
and for females by about 17 years. Extensive records for British men show that for the 
general population the maximum in question during the latter half of the nineteenth 
century was about 26 years, and that in the present century it tended to move to a younger 
age, being about 20 years to-day. The change is presumed to have been due to the better 
feeding of children. For some years before 1945 the German general population may have 
been fed better than the British, resulting in the attainment of maximum height, on the 








* A figure in brackets is the number of observations on which the mean is based and some repeated weights of the same people are included.
† Allowing for clothes 3.5 kg. (7.7 lb.) for men and 2 kg. (4.4 lb.) for women.













[Fig. 6. Two panels: MEN, weights standardized to height 1685 mm.; WOMEN, weights standardized to height 1575 mm. Curves labelled in each panel include Very heavy workers and Moderately heavy workers. Horizontal axis: Age (points central values for 10-year groups).]

Fig. 6. Average weights (indoor clothing and without shoes) standardized for height of occupational
series of German men and women. (British zone except Berlin, December 1946-June 1947.)



Fig. 7. Average heights of German males and 
females. (British zone except Berlin, December 
1946-June 1947; data in Table 3.) 



Fig. 8. Average weights (indoor clothing and 
without shoes) of German males and females. 
(British zone except Berlin, December 1946- 
June 1947; data in Table 3.) 





Table 3. Mean heights and weights of German boys and girls for the
total British zone except Berlin

A. Measurements as recorded in indoor clothing and without shoes

Age (years):      No. of measurements*     Mean height (mm.)      Mean weight (kg.)
central values    Boys        Girls        Boys        Girls      Boys       Girls

 0.7                 153         135        676.7       668.3      9.03       8.49
 1.7                 116          71        814.9       798.6     11.77      11.11
 2.7                 648         448        907.7       914.7     13.94      13.26
 3.7               1,055       1,076        985.5       984.1     15.84      15.34
 4.7               1,451       1,407       1051.7      1040.7     17.54      17.12
 5.7               1,673       1,444       1099.1      1101.2     18.87      18.29
 6.7               3,714       3,112       1184.9      1174.6     21.67      20.79
 7.7               8,868       8,143       1221.9      1219.5     23.23      22.39
 8.7               8,390       8,569       1269.8      1264.9     25.35      24.55
 9.7               9,523       9,420       1317.5      1315.0     27.83      27.01
10.7              10,879      11,414       1362.6      1360.2     30.28      29.74
11.7              10,246      11,048       1408.2      1413.7     32.81      32.83
12.7               8,752       9,057       1447.5      1472.1     35.71      36.66
13.7               5,167       4,896       1498.1      1534.1     38.28      40.92
14.7               3,277       3,076       1558.5      1581.0     44.48      46.23
15.7               1,750       1,436       1627.5      1609.2     50.95      50.48
16.7                 834         547       1689.1      1610.6     56.20      51.32
17.7                 452         416       1719.3      1621.5     59.87      53.81
18.7                 119         216       1720.3      1606.2     61.25      54.04
19.7                  74          57       1774.4      1598.2     64.50      53.24
20.7                  33          32       1755.8      1622.7     62.40      52.18
21.7                  19          ..       1750.1        ..       59.47       ..
22.7                  43          ..       1763.0        ..       65.27       ..
23.7                  37          ..       1755.8        ..       65.76       ..
24.7                  44          ..       1753.9        ..       64.58       ..
25.7                  43          ..       1750.2        ..       62.59       ..

Totals            77,360      76,020         ..          ..         ..         ..

* These are the numbers of observations on which the means are based and some repeated
measurements of the same individuals are included.


B. Estimated nude weights reduced to height 1685 mm. for males and 1575 mm. for females

Age (years):
central values    Males†          Females‡

16.7              52.93 (834)     47.61 (547)
17.7              54.39 (452)     49.25 (416)
18.7              55.46 (119)     50.32 (216)
19.7              56.55 (74)      49.96 (57)

† Allowances for clothes 3 kg. (6.6 lb.) for age group 16.7, 3.25 for 17.7 and 3.5 for groups 18.7
and 19.7.

‡ Allowances for clothes 1.75 kg. (3.9 lb.) for age group 16.7, 2 for 17.7, 18.7 and 19.7.
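
The passage from Part A to Part B of Table 3 can be retraced for a few rows as a check on the method. The sketch below assumes, as the text of this paper suggests but does not state step by step, that the clothing allowance is deducted first and that the adult coefficients (0.65 kg. per cm. for males, 0.55 for females) are then applied to the difference between the reference height and the recorded mean height. The function and variable names are hypothetical, and the allowance for females aged 16.7 is read as 1.75 kg.

```python
ADULT_COEFF_KG_PER_CM = {"male": 0.65, "female": 0.55}
REFERENCE_HEIGHT_MM = {"male": 1685.0, "female": 1575.0}


def nude_weight_at_reference_height(weight_kg, height_mm, clothes_kg, sex):
    """Deduct the clothing allowance, then reduce the weight to the reference height."""
    height_difference_cm = (REFERENCE_HEIGHT_MM[sex] - height_mm) / 10.0
    return weight_kg - clothes_kg + ADULT_COEFF_KG_PER_CM[sex] * height_difference_cm


# (sex, central age, mean weight kg. and mean height mm. from Part A, clothing allowance kg.)
rows = [
    ("male", 16.7, 56.20, 1689.1, 3.0),
    ("male", 17.7, 59.87, 1719.3, 3.25),
    ("female", 16.7, 51.32, 1610.6, 1.75),
    ("female", 17.7, 53.81, 1621.5, 2.0),
]
results = {
    (sex, age): round(nude_weight_at_reference_height(w, h, c, sex), 2)
    for sex, age, w, h, c in rows
}
```

For these four rows the sketch gives 52.93, 54.39, 47.61 and 49.25 kg., which agree with the corresponding entries of Part B.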





average, at a still younger age. Restrictions thereafter would not be expected to modify 
the form of the age curve appreciably until after a lapse of some years. 

The weight curves cross at about ages 11.5 and 15.5 years. Compared with pre-war data,
both are peculiar in showing a marked decline in weight increments after age 18. 

5. Comparisons of the post-war and earlier German series 

5.1. Records of the heights and weights of German children living in the British zone of
Germany are fairly abundant for years before 1939 and a few of the series were selected for 
comparison here. They are for: 

(a) Boys and girls measured in Berlin about 1900, data for ‘higher schools’ (Gymnasien)
and ‘lower schools’ (Gemeindeschulen) being given separately (Rietz, 1903).

(b) Boys and girls in a Berlin orphanage measured in 1919 (Davidsohn, 1919).

(c) Boys and girls (heights only) measured in ‘lower schools’ (Volksschulen) in Kiel and
other towns in Holstein in 1902 (Ranke, 1905).

(d) Boys only measured in ‘higher schools’ in Hamburg about 1878 (Kotelmann, 1879,
quoted by Ranke, 1905).

Means for these series are given in Tables 4 and 5 and those for the Berlin children are 
plotted, together with the means for the 1946-7 series, in Figs. 9-12. 

5.2. As expected, the age curves for the pre-war Berlin series make clear distinctions
between the social classes represented. It is surprising, however, to find that the curves for 
the general population in 1946-7 indicate a standard between those of the earlier ‘higher 
school’ children, on the one hand, and of the ‘lower school’ and orphanage children, on the 
other. This is so for both heights and weights of both boys and girls. The present-day 
Schleswig-Holstein series is also superior to earlier ‘lower school’ children of Holstein in the
case of heights of boys and girls; and for Hamburg series the 1946-7 means exceed those of 
‘higher school’ boys in 1878. The standards of the post-war German children are evidently 
not markedly depressed. 

5.3. The average weights are best compared by reducing them at each age to a constant
height (see § 3.6). The heights used for this purpose are those of the total 1946-7 series for
the British zone except Berlin, and the weights of all the pre-war German series reduced to 
these heights are given in Table 7 and plotted in Figs. 17 and 18. In general the post-war 
series show standards of weights, after allowance is made for differences in height, which do 
not compare unfavourably with those of pre-war lower social grades of the German population
(see also § 6.3).

5.4. The writer has been unable to find any records of large pre-war series of German
adults suitable for comparison with the 1946-7 data. Maps compiled by anthropologists 
(Coon, 1939, and earlier authorities) show average heights for the male adult populations of 
different parts of the British zone of Germany ranging from 168 to 173 cm. The mean for 
the region in 1946-7 is about 176 cm. (Fig. 7). A probable explanation of the superiority of 
the later estimate is that it represents the maximum of the age curve (reached to-day about
19½ years), whereas the earlier estimates were derived mainly from data for conscripts (ages
about 18-20) at times when the maximum of the age curve was normally not attained until 
some age near 25 years. It is known for the general British population that the age in 
question changed from about 26 years in the latter half of the nineteenth century to about 
20 years to-day, while the maximum mean height attained remained unchanged in the past 
hundred years. 



Table 4. Average heights (without shoes) in mm. for pre-war series of German children 








Table 5. Average weights (in indoor clothing and without shoes) in kg.
for pre-war series of German children



* Nude weight given: allowances made for clothes as in table in § 3.5(c).






Fig. 9. Average heights for series of boys 
measured in Berlin. (Data for 1946-7 series 
in Table 3, and for other series in Table 4.) 



Fig. 10. Average weights (indoor clothing and 
without shoes) for series of boys measured in 
Berlin. (Data for 1946-7 series in Table 3, and 
for other series in Table 5.) 



Fig. 11. Average heights for series of girls 
measured in Berlin. (Data for 1946-7 series 
in Table 3, and for other series in Table 4.) 



Fig. 12. Average weights (indoor clothing and 
without shoes) for series of girls measured in 
Berlin. (Data for 1946-7 series in Table 3, 
and for other series in Table 5.)






6. Comparisons of the post-war and earlier German and
British series of children and adolescents

6.1. The mean heights and weights of recent British series given in Table 6 relate to:

(a) Boys’ boarding schools assigned to the ‘lowest economic standing’ among boarding
schools (group C), measurements being recorded in 1936-8 (Friend & Bransby, 1947).

(b) Male and female industrial workers included in a survey carried out by the Ministry
of Food in 1943 (Kemsley, 1946).

(c) Male industrial workers included in a survey carried out by the Industrial Health
Research Board of the Medical Research Council in 1929-32 (Cathcart et al. 1935).


Table 6. Average heights and weights for series of British children and adolescents

Boys

Age (years):      Height without shoes (mm.)                      Weight in indoor clothing and without shoes (kg.)
central values    Boarding       Industrial     Industrial        Boarding    Industrial    Industrial
                  schools        workers        workers           schools     workers       workers
                  1936-8         1943           1929-32           1936-8      1943          1929-32

11                1427 (284)     ..             ..                35.0        ..            ..
12                1443 (563)     ..             ..                36.8        ..            ..
13                1504 (888)     ..             ..                40.9        ..            ..
14                1565 (1537)    1558 (222)     ..                45.9        45.9          ..
15                1628 (1573)    1580 (806)     1540 (139)        51.7        49.1          44.9
16                1704 (544)     1637 (1310)    1603 (206)        58.9        54.0          50.7
17                1745 (412)     1673 (1560)    1656 (295)        63.0        57.7          55.8
18                1750 (152)     1684 (1240)    1686 (329)        65.1        59.9          59.2
19                ..             1693 (729)     1691 (335)        ..          61.1          60.6

Girls

Age (years):      Height without shoes (mm.)                      Weight in indoor clothing and without shoes (kg.)
central values    ..             Industrial     Industrial        ..          Industrial    Industrial
                                 workers        workers                       workers       workers
                                 1943           1926                          1943          1926

14                ..             1545 (189)     ..                ..          46.5          ..
15                ..             1557 (698)     1557 (162)        ..          49.5          45.8
16                ..             1575 (1332)    1566 (213)        ..          51.7          48.1
17                ..             1579 (1022)    1570 (257)        ..          52.8          49.5
18                ..             1579 (1620)    1582 (259)        ..          53.5          52.0
19                ..             1582 (1377)    1585 (209)        ..          54.0          52.7


(d) Female industrial workers included in a survey carried out by the same Board in 
1926 (Cathcart et al. 1927). 

6.2. The age 'curves' for these British and for the 1946-7 German (British zone except 
Berlin) series are shown in Figs. 13-16. For heights and weights of boys the German curves 
fall between those for the British pre-war boarding schools and industrial series, and the 





Fig. 13. Average heights for three British and 
a German series of boys. (Data for British 
series in Table 6, and for the German in Table 3.) 



Fig. 14. Average heights for two British and 
a German series of girls. (Data for British series 
in Table 6, and for the German in Table 3.) 



Fig. 15. Average weights (indoor clothing and 
without shoes) for three British and a German 
series of boys. (Data for British series in 
Table 6, and for the German in Table 3.) 




Fig. 16. Average weights (indoor clothing and 
without shoes) for two British and a German 
series of girls. (Data for British series in 
Table 6, and for the German in Table 3.) 








three have very similar forms. It may be noted that the 1943 British series of male 
industrial workers differs appreciably from the 1929-32 series representing the same class in 
showing, for both height and weight, greater means for ages under 18. For girls the German 
series is markedly superior for height, but inferior for weight, to the wartime British series. 
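
The statement about the two series of male industrial workers can be checked directly from the Table 6 means. The following is only an illustrative sketch: the figures are transcribed from Table 6 as read here (in particular the 1943 height at age 16 is taken to be 1637 mm.), and the variable and function names are hypothetical.

```python
# Means for male industrial workers, transcribed from Table 6.
# Assumption: the 16-year 1943 height is read as 1637 mm.
ages = [15, 16, 17, 18, 19]
height_1943 = [1580, 1637, 1673, 1684, 1693]      # mm., survey of 1943
height_1929_32 = [1540, 1603, 1656, 1686, 1691]   # mm., survey of 1929-32
weight_1943 = [49.1, 54.0, 57.7, 59.9, 61.1]      # kg., survey of 1943
weight_1929_32 = [44.9, 50.7, 55.8, 59.2, 60.6]   # kg., survey of 1929-32


def differences(later, earlier):
    """Later-survey mean minus earlier-survey mean, age by age."""
    return [round(a - b, 1) for a, b in zip(later, earlier)]


height_diff = dict(zip(ages, differences(height_1943, height_1929_32)))
weight_diff = dict(zip(ages, differences(weight_1943, weight_1929_32)))

for age in ages:
    print(f"age {age}: height {height_diff[age]:+.0f} mm., "
          f"weight {weight_diff[age]:+.1f} kg.")
```

On these figures the 1943 excess is large at ages 15-17 and is negligible, or reversed, at ages 18 and 19.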


6.3. Mean weights reduced to the same height at each age of the 1946-7 German series 
are given in Table 7 and plotted in Figs. 17 and 18 together with those for the other 
German and British series. For boys the various curves run nearly parallel courses and 
they suggest the following order: 


English boarding schools, 1936-8 (heaviest) 
British industrial workers, 1943 
Berlin 'higher schools', c. 1900 
British industrial workers, 1929-32 
Berlin orphanage, 1919 
Hamburg 'higher schools', c. 1878 
German general population, 1946-7 
Berlin 'lower schools', c. 1900 (lightest) 


Noteworthy features of this sequence are the high position of the British series of wartime 
industrial workers in it, and the fact that the post-war German series does not occupy an 
extremely low position. Its inferiority is most marked for the oldest age group (age 18). 
For ages 14-18 the mean reduced weights for the British wartime workers exceed the 
corresponding 1946-7 German values by amounts ranging from about 2-3 kg. (4.5-7.7 lb.). 

The reduced weights for series of girls (Fig. 18) suggest the order: 


Berlin 'higher schools', c. 1900 (heaviest) 
British industrial workers, 1943 
Berlin 'lower schools', c. 1900 
Berlin orphanage, 1919 
German general population, 1946-7 
British industrial workers, 1926 (lightest) 


A British pre-war standard is here below the post-war German, but the level of the 
British wartime series is decidedly high. The latter conclusion, which applies to both boys 
and girls, is satisfactory. It may be concluded that for the age range considered here — from 
the 14th to the 19th birthday — falls in the British 1943 standards for industrial workers less 
than 3 lb. would not be a serious matter. Reductions up to that limit, for both males and 
females, would mean that the class was still not inferior to its pre-war level. 


7. Comparisons of average weights standardized to the same heights for the 
post-war German and Royal Air Force and other British and Dominion 
series of adults 

7.1. The comparisons in this section are of average weights at different ages for various 
series of men and women. Nude weights are treated, allowances for clothes given in § 3.5 
having been made where necessary, and all means are reduced to the height 1685 mm. for 
men and 1575 mm. for women by the method described in § 3.6. The series are: 

(a) R.A.F. and Dominion series of aircrew. The survey of heights and weights was made 
in 1944. All the men were in an advanced stage of training (at O.T.U.'s) or on operational 
duties in Great Britain (unpublished). 

(b) R.A.F. aircrew recruits. The majority of the men, measured in 1942, were direct 
entry civilians but some had served previously as R.A.F. ground staff. Many of the subjects 
of this survey must have been remeasured in the 1944 survey (a) above, though the exact 
proportion is unknown (Morant, 1943). 

(c) R.A.F. aircrew in training. When measured in 1944 the men had just returned from 
training overseas (Canada and South Africa) (Morant & Gilson, 1945). 





Table 7. Average weights (in indoor clothing and without shoes) in kg. reduced at each age to 
the average height of the German (British zone except Berlin) 1946-7 series for various 
other German and British series of children and adolescents* 

Boys 
                               German                                        British 
Age        General     Berlin      Berlin      Berlin      Hamburg     Boarding   Industrial  Industrial 
(years):   population  'Higher     'Lower      orphanage   'Higher     schools    workers     workers 
central    1946-7      schools'    schools'    1919        schools'    1936-8     1943        1929-32 
values                 c. 1900     c. 1900                 c. 1878 

  6.5      21.2        22.2        21.3        22.7        -           -          -           - 
  7.5      23.1        23.7        23.0        24.2        -           -          -           - 
  8.5      25.4        26.1        25.0        25.2        -           -          -           - 
  9.5      27.5        27.7        27.2        27.4        27.7        -          -           - 
 10.5      30.0        30.7        29.4        30.6        30.2        -          -           - 
 11.5      32.2        33.3        31.9        33.2        3?.7        34.5       -           - 
 12.5      35.0        36.6        34.8        36.1        35.7        37.4       -           - 
 13.5      38.0        40.8        38.6        39.3        38.7        41.2       -           - 
 14.5      43.4        45.5        42.1        42.2        44.4        46.2       46.5        - 
 15.5      49.9        51.5        -           -           50.6        52.5       52.2        50.7 
 16.5      55.4        57.7        -           -           55.9        58.4       57.7        56.4 
 17.5      59.1        60.4        -           -           59.6        62.4       60.9        60.0 
 18.5      61.1        65.7        -           -           63.4        -          63.2        62.6 

Girls 
                               German                            British 
Age        General     Berlin      Berlin      Berlin      Industrial  Industrial 
(years):   population  'Higher     'Lower      orphanage   workers     workers 
central    1946-7      schools'    schools'    1919        1943        1926 
values                 c. 1900     c. 1900 

  6.5      20.3        22.0        20.9        20.7        -           - 
  7.5      22.2        23.8        22.6        21.9        -           - 
  8.5      24.2        26.8        24.5        23.9        -           - 
  9.5      26.4        27.5        26.3        26.2        -           - 
 10.5      29.0        31.9        29.0        29.0        -           - 
 11.5      32.0        34.3        32.4        32.3        -           - 
 12.5      35.9        40.1        37.3        36.9        -           - 
 13.5      40.0        43.0        41.3        42.1        -           - 
 14.5      45.4        49.9        46.5        47.1        49.0        - 
 15.5      49.9        52.8        -           -           53.0        49.5 
 16.5      51.3        -           -           -           54.1        51.1 
 17.5      53.5        -           -           -           53.9        52.7 
 18.5      54.0        -           -           -           55.4        53.8 

* Another series of interest is of Belgian children of a working-class district of Brussels measured in the 
last three months of 1942, 1943 and 1944 (Ellis, 1945). To obtain large enough numbers at each age the data 
for the three years were pooled, giving n's ranging from 45 to 309, most representing more than 200 
individuals. It is shown in the paper describing the records that the wartime children were above pre-war Belgian 
standards for height, and of the same order as the pre-war standards for weight. Reducing the mean weights 
(kg.) to the heights of the German 1946-7 series gives: 

Age (years): 
central values    6.5    7.5    8.5    9.5    10.5   11.5   12.5   13.5   14.5   15.5 

Boys              21.1   22.7   24.9   27.3   30.3   32.3   35.4   39.0   44.1   50.3 
Girls             20.4   22.0   24.4   26.3   29.1   32.7   36.9   42.4   48.0   52.2 

Compared with the series in the table above these reduced weights (in indoor clothing and without shoes) are 
extremely low for boys aged 6, 7 and 8, and low but not extreme in nearly all other cases. 







Fig. 17. Average weights (indoor clothing and without shoes) of German and British series of boys 
reduced to the average heights of the German 1946-7 series. (Data in Table 7.) 

Fig. 18. Average weights (indoor clothing and without shoes) of German and British series of girls 
reduced to the average heights of the German 1946-7 series. (Data in Table 7.) 







(d) R.N. pilots. The measurements used were those of the men when they were recruited 
for the Royal Navy from 1940 to 1946 (Morant, 1947). 

(e) R.A.F. ground staff. The series represents a random sample of men recruited from 
1940 to 1944, the measurements being those recorded at their first medical examinations 
(unpublished). 

(f) Conscripts examined from 1 November 1917 to 31 October 1918. The data are for 
men examined by recruiting Medical Boards of the West Midland Region of England 
(Ministry of National Service, 1920). Those who were rejected for service in the Forces are 
included. At the time many who had previously been rejected were recalled for 
examination. 

(g) Series of pre-war industrial workers. The surveys, carried out for the Industrial 
(Fatigue) Health Research Board of the Medical Research Council, were of women in 1926 
(Cathcart et al. 1927) and of men in 1929-32 (Cathcart et al. 1936). 

(h) Series of wartime industrial workers. The survey was carried out for the Ministry of 
Food in 1943 (Kemsley, 1946). Mean heights and weights given in the report were used to 
obtain the reduced weights shown in Figs. 20 and 21. 

It should be noted that the Service series (b), (d), (e) and (f) are made up by men of whom 
all, or most, were civilians up to the time measurements were recorded. 

7.2. The reduced weights for all the British and for the post-war German series are 
plotted in Figs. 19 and 20 (men) and 21 (women). In the case of the men there are far more 
series for ages up to 40 than for ages over 40. Considering the younger age range (17-40 
years) only, the series suggest the following order, though the sequence is not exactly the 
same for all ages within the range: 

Operational aircrew: New Zealand (heaviest) 
                     Australian 
                     Canadian 
R.A.F.: Aircrew in training 
        Operational aircrew 
R.N. pilot recruits (most civilian) 
R.A.F. aircrew recruits (most civilian) 
R.A.F. ground-staff recruits 
Industrial workers, 1943 
Employed industrial workers, 1929-32 
Germans, 1946-7 
Conscripts, 1917-18 and unemployed industrial workers, 1929-32 (lightest) 

This sequence is clearly significant. During the war, aircrew were the best-fed section of 
the British population, though the R.A.F. did not reach the high weight levels of the 
Dominion aircrew. The superiority of the New Zealand series is suggestive in view of the 
high nutritional and health standards of that country. The slight superiority of the aircrew 
in training over the British operational aircrew might be due to the fact that the former 
were measured immediately after their return from overseas ; or the distinction might be 
partly due to the loss in weight, on the average, during the earlier stages of wartime 
operational tours, which was found to be of the order 1.5 lb. (Reid, 1947). Aircrew recruits 
(R.N. pilots and R.A.F.) come next, followed by R.A.F. ground-staff recruits. When 
plotted together, the R.A.F. ground-staff recruits (Fig. 19) are seen to have greater mean 



weights than the wartime and pre-war industrial workers (Fig. 20) for ages up to about 
28 years, but smaller means for all later ages represented. The younger men accepted for 
service, chiefly on medical grounds, must have been heavier, on the average, than the 



Fig. 19. Average weights (nude) standardized to height 1685 mm. for the post-war German series of 
civilians and series of R.N. and R.A.F. personnel. The smallest series are of R.N. pilots (n = 200), R.A.F. 
aircrew in training (529) and New Zealand aircrew (550), all the others being of 1600 or more men. 

population from which they were drawn, but for the older men the position was reversed. 
For men up to age 40 it is surprising to find that the post-war German series has to be 
classed rather above two earlier British ones. Compared with them it shows superiority in 
average weights for ages up to about 25 years, and the same level as theirs for ages 25-40. 





For ages over 40 (Fig. 20) the situation is markedly different. The German series clearly 
falls to the lowest place and the wartime industrial workers fall below the pre-war employed 
workers of the same class, though remaining above the low levels of the pre-war unemployed 



Fig. 20. Average weights (nude) standardized to height 1685 mm. for the post-war German and British 
series of civilian men. The smallest series is of 1,328 unemployed industrial workers and each of 
the other four relates to more than 10,000 men. 





industrial workers and the 1917-18 conscripts. Food rationing affects most markedly the 
weights of the middle aged and aged. 

This is shown again by the few female series (Fig. 21). The maximum of the age curve for 
the German women is in the twenties— compared with the forties for the men (Fig. 20) — 
and at age 57 the German mean is nearly 10 kg. (22 lb.) below that of the British wartime 
workers. The corresponding series of men show a maximum divergence at age 75 of nearly 
7 kg. (15 lb.). In general the absolute, and still more clearly the relative, weight losses of 
German adults since the war must have been greater for women than for men. 




Fig. 21. Average weights standardized to height 1575 mm. for two British and a German series of 
women. The 1926 series is of 3,000, and the 1943 series of 31,260, British women. 

7.3. The normal form of the age curve for weight in the case of both males and females 
shows continuous increase of the average up to some age about 60 years and then a decline 
in old age. Of the series considered the only ones which fail to conform to this pattern are 
the R.A.F. aircrew and ground-staff recruits (Fig. 19), which show increase to age 25 and 
then remain at about the same level, the German series of men (Fig. 20, maximum about 
40 years) and women (Fig. 21, decline from ages 19-20 and maximum in early twenties), 





and the pre-war British series of women industrial workers (Fig. 21, abnormal decline 
between ages 19 and 24). All except the last of these series had been subject to civilian 
rationing, but the 1943 men and women industrial workers, who show the normal form of 
curve, were also subject to it. The peculiarities of the recent series are probably due to the 
interaction of rationing, of medical selection in some cases, and of other imponderable 
factors. 

7.4. The data presented provide standards which may be of use in future comparisons. It 
might be asked, for example, what losses in weight for a particular civilian or Service 
population for which repeated records are available should be considered a matter of serious 
concern. It is clear that such a question should only be considered with reference to 
particular years of age or age groups covering a few years. Statements regarding all men, 
say, or all boys, would be of no scientific value. 

7.5. Records of average weights appreciably lower than those for the post-war Germans 
are not easy to find. Comparison is made below with means for broad age groups given for 
United States prisoners in Santo Tomas camp, Manila, Philippine Islands, 36 months after 
internment (Brown, 1946). The corresponding means for the Germans can only be estimated 
approximately as the data for them are in decennial age groups and no finer age distribution 
for the American series is given. Also the German weights are standardized to heights 
which are probably rather below the unknown averages for the American series : 


Mean weights (nude) in lb. 

          Ages       American   German   Difference 

Men       19-40        124       128         4 
          41-60        122       130         8 
          Over 60      119       128         9 

Women     19-40        101       113        12 
          41-60        100       112        12 
          Over 60       96       109        13 


The differences here are substantial and greater for women than for men. The evidence 
considered in this paper is consistent in showing for all adult ages that loss of weight due to 
restricted rations is greater for females than for males. 
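
The arithmetic of the comparison above can be reproduced in a few lines. This is a minimal sketch using only the six pairs of means quoted in the table; the percentage column (difference as a proportion of the German mean) is an illustrative addition and the names are hypothetical.

```python
# (sex, age group, American internees, Germans 1946-7): nude weights in lb.,
# copied from the comparison table in section 7.5.
rows = [
    ("Men", "19-40", 124, 128),
    ("Men", "41-60", 122, 130),
    ("Men", "Over 60", 119, 128),
    ("Women", "19-40", 101, 113),
    ("Women", "41-60", 100, 112),
    ("Women", "Over 60", 96, 109),
]


def difference_table(rows):
    """Return (sex, ages, difference in lb., difference as % of German mean)."""
    out = []
    for sex, ages, american, german in rows:
        diff = german - american
        out.append((sex, ages, diff, round(100.0 * diff / german, 1)))
    return out


table = difference_table(rows)
for sex, ages, diff, pct in table:
    print(f"{sex:5s} {ages:8s} {diff:3d} lb. ({pct:.1f}% of German mean)")

men = [diff for sex, _, diff, _ in table if sex == "Men"]
women = [diff for sex, _, diff, _ in table if sex == "Women"]
```

Every difference for women exceeds every difference for men, in absolute terms and, since the women's means are the smaller, in relative terms also.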

This paper is chiefly concerned with a treatment of unpublished official records. The 
writer acknowledges with thanks permission to use such material given by : (a) the Public 
Health Branch of the Control Commission for Germany (British Element), by courtesy of 
Brigadier W. Strelley Martin, M.C., Public Health Adviser, Health Branch, C.C.G., and 
F. D. G. Bailey, Esq., formerly Nutrition Officer, Health Branch C.C.G.; (b) the Medical 
Directorate of the Air Ministry, by courtesy of Air Vice-Marshal P. C. Livingston, 
C.B., C.B.E., A.F.C., F.R.C.S., Director-General of Medical Services, Royal Air Force; 
(c) the Ministry of Food, by courtesy of the Chief Scientific Officer to the Ministry. 





REFERENCES 

Brown, B. H. (1946). A detailed report on the weights and weight losses of twenty-four men in Santo 
Tomas internment camp. J. Lab. Clin. Med. 31, 1129-32. 

Cathcart, E. P., Hughes, D. E. R. & Chalmers, J. G. (1935). The physique of man in industry. 
M.R.C., Industr. Heal. Res. Board, Report no. 71. 

Cathcart, E. P., Bedale, E. M., Blair, C., Macleod, K. & Weatherhead, M. (1927). The physique 
of women in industry. M.R.C., Industr. Fat. Res. Board, Report no. 44. 

Control Commission for Germany (British Element), Public Health Branch, No. 1 Nutrition 
Survey Team (1947). Report on a body-weight survey in the British zone. 

Coon, C. S. (1939). The Races of Europe. New York: Macmillan. 

Davidsohn, H. (1919). Die Wirkung der Aushungerung Deutschlands auf die Berliner Kinder mit 
besonderer Berücksichtigung der Waisenkinder der Stadt Berlin. Z. Kinderheilk. 21, 349-407. 

Dunstan, W. R. (1925). Height and weight of school children in an English rural area. Metron, 5. 

Elderton, E. M. (1914). Height and weight of school children in Glasgow. Biometrika, 10, 288-339. 

Ellis, R. W. B. (1945). Growth and health of Belgian children during and after the German occupation 
(1940-1944). Arch. Dis. Childhood, 20, 97-109. 

Friend, G. E. (1935). The Schoolboy: a Study of his Nutrition, Physical Development and Health. 
Cambridge: Heffer. 

Friend, G. E. & Bransby, E. R. (1947). Physique and growth of schoolboys. Lancet, 8 Nov., p. 677. 

Kemsley, W. F. F. (1945). A Preliminary Account of the Ministry of Food's Survey into the Weight and 
Height of the Population during 1943. Ministry of Food Surveys Branch, DPD/W/6. 

Kemsley, W. F. F. (1946). Body Weight Survey: Report on the Weighings taken between April 1943 and 
October 1945. Ministry of Food Surveys Branch, DPD/W/7. 

Kemsley, W. F. F. (1947). Body Weight Survey: Variations in Body Weight, 1943 to 1946. Ministry of 
Food Surveys Branch, DPD/W/8. 

Kotelmann, K. (1879). Die Körperverhältnisse der Gelehrtenschüler des Johanneums in Hamburg. 
Z. kgl. preuss. stat. Bur., Berlin, 19, 1-16. 

Ministry of National Service, 1917-19 (1920). Report, Vol. 1, upon the Physical Examination of Men 
of Military Age by National Service Medical Boards from November 1st, 1917 to October 31st, 1918. 
H.M.S.O. 

Morant, G. M. (1943). Preliminary report on body measurements of 2,400 candidates for air-crew. 
Flying Personnel Res. Comm. no. 538. 

Morant, G. M. (1947). The heights and weights of Royal Naval pilots compared with those of R.A.F. 
pilots. Flying Personnel Res. Comm. no. 673. 

Morant, G. M. & Gilson, J. C. (1945). A report on a survey of body and clothing measurements of 
Royal Air Force personnel. Flying Personnel Res. Comm. no. 633a. 

Mumford, A. A. (1927). Healthy Growth: a Study of the Relation between the Mental and Physical 
Development of Adolescent Boys in a Public Day School. Oxford University Press. 

Ranke, O. (1905). Beiträge zur Frage kindlichen Wachstums: anthropologische Untersuchungen 
ausgeführt an holsteinischen Kindern von der Geburt bis zum vollendeter 15 Jahre. Arch. Anthrop., 
N.F., 3, 161-80. 

Reid, D. D. (1947). Some measures of the effect of operational stress on bomber crews (F.P.R.C. 605). 
Psychological Disorders in Flying Personnel of the Royal Air Force investigated during the War 
1939-1945, Air Publ. 3139, pp. 245-58. H.M.S.O. 

Rietz, E. (1903). Das Wachstum der Berliner Schulkinder während der Schuljahre. Arch. Anthrop., 
N.F., 1, 30-40. 

Schlesinger, E. (1917). Das Wachstum der Knaben und Jünglinge vom 6 bis 20 Lebensjahr. Z. 
Kinderheilk. 16, 265-304. 

Schmid-Monnard, K. (1895). Über den Einfluss der Jahreszeit und der Schule auf das Wachstum der 
Kinder. Jahr. Kinderheilk. 40, 84-107. 





TESTING THE SIGNIFICANCE OF CORRELATION 
BETWEEN TIME SERIES* 

By G. H. ORCUTT and S. F. JAMES, Department of Applied Economics, 

University of Cambridge 

I. Introduction 


Verification that two things are related to each other is achieved by showing empirically that 
there is a correspondence between their behaviours greater than could be expected by chance. 
Where continued experiment is not possible the data available are usually severely limited 
in both quantity and range of variation and it therefore becomes essential to have precise 
ideas whether the agreement with a hypothesis is actually a verification of it or whether it is 
more reasonable to consider the agreement as merely the result of a chance correspondence. 
This paper discusses the problem of deciding when a correlation between two economic time 
series is great enough to make it unreasonable to assume that the series are unrelated. 

The testing of the significance of a correlation involves a comparison with what would 
have been obtained between non-related series thought to be analogous to the observed 
series. And, of course, the significance found for the correlation will depend upon the 
analogy deemed to be appropriate. The choice of an analogy depends upon experience as to 
which aspects of the real series being correlated are vital, in the sense that they affect the 
probability of obtaining chance correlations between non-related series. We can never be 
certain that some important aspect has not been overlooked, but, as our experience is 
broadened and we learn to take into account more and more factors, the chances of our 
running into a situation in which the analogy we choose is actually misleading become less 
and less. If we were forced to base our choice of an appropriate analogy to use in any indi- 
vidual situation on the data of that situation alone, the uncertainty of our tests of signifi- 
cance would be very great. Usually, however, the situation is one which we have learned to 
be similar to some larger class of experiences, and it is this larger class of experiences that 
furnishes a greater measure of information as to the analogy appropriate to all members of 
the class. 

The most commonly used sampling model for generating series of independent terms is to 
draw them at random from a normal population of values; and in applying tests based upon 
this sampling model one must take account of the length of the series being dealt with. 
Fortunately, there is some evidence that tests of significance based on this sampling model 
are insensitive to variation of the frequency distribution of the population of values from 
which the random sampling is done (Pearson, 1931), and this makes it reasonable to apply 
such tests even when little is known of the frequency distributions from which the items of 
our real series have been drawn. There is, however, one obvious point at which the analogy 
underlying such tests of significance may break down when one is concerned with economic 
time series, namely, if the consecutive terms are really correlated. In economic time series, in 
meteorological time or spatial series, or, for that matter, in biological time series, autocorre- 
lation usually exists. Production, or employment, or price-level series never go directly from 

* Mr James was largely responsible for the part embodied in section III of this study. We both wish to 
express our appreciation for the large measure of assistance given us by Mr Richard Stone. 




high values to low values, but, instead, high values are followed by values which are also high 
and a transition from high values to low values only takes place over a period of time. How 
closely successive values are related will of course partly depend upon the time between 
measurements and, as this time is made shorter and shorter, successive values of the series of 
measurements become more and more like their immediate neighbours. Autocorrelations are 
often very high and remain high even as the series are lengthened, whereas in random series 
the autocorrelations are small tending to zero as the series increase in length. In economics, 
most of the material that we wish to investigate for relationships exhibits autocorrelation, 
and there is a real need for a test of significance for correlations which is based on a more 
realistic sampling model. 
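
The contrast drawn here between random series and smoothly moving series can be illustrated numerically. The sketch below is not part of the original study: it uses normal pseudo-random numbers from Python's standard library, takes cumulative sums as a simple stand-in for a series whose neighbouring values resemble each other, and computes the first sample autocorrelation by the ordinary product-moment form; all names are hypothetical.

```python
import random


def lag1_autocorrelation(series):
    """First sample autocorrelation: lagged products over the sum of squares."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in series)
    return num / den


rng = random.Random(0)                      # fixed seed, for repeatability only
independent = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Cumulative sums of the same shocks: each value stays close to its predecessor.
smooth = []
level = 0.0
for shock in independent:
    level += shock
    smooth.append(level)

print("independent terms:", round(lag1_autocorrelation(independent), 3))
print("cumulated terms:  ", round(lag1_autocorrelation(smooth), 3))
```

For the independent terms the coefficient is close to zero, while for the cumulated series it is close to unity.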

Bartlett (1935) obtained the following large sample approximation of the variance of the 
sample correlations between two autocorrelated series having a true correlation of zero: 

$$\operatorname{var} r \simeq \frac{n + 2[(n-1)\rho_1\rho_1' + (n-2)\rho_1^2\rho_1'^2 + \dots + \rho_1^{n-1}\rho_1'^{\,n-1}]}{n^2}, \tag{1}$$

or more approximately 

$$\operatorname{var} r \simeq \frac{1}{n}\,\frac{1+\rho_1\rho_1'}{1-\rho_1\rho_1'}, \tag{2}$$

where $\rho_1$ is the true value of the first autocorrelation of one of the two series and $\rho_1'$ is that 
of the other. This is based on the assumption that each of the series was generated by 
the following type of process 

$$x_t = \rho_1 x_{t-1} + \epsilon_t, \tag{3}$$

where the random error term, $\epsilon_t$, is independent of $x_{t-1}$ and $E(\epsilon_t) = 0$. Since then, Bartlett 
(1946, 1947) and Quenouille (1947) have given a large sample approximation of var r for 
correlations between any two autoregressive schemes having a true correlation of zero.* 
For linear autoregressive schemes it is 

$$\operatorname{var} r \simeq \frac{n + 2[(n-1)\rho_1\rho_1' + (n-2)\rho_2\rho_2' + \dots + \rho_{n-1}\rho_{n-1}']}{n^2}, \tag{4}$$

or more approximately 

$$\operatorname{var} r \simeq \sum_{t=-\infty}^{\infty} \rho_t\rho_t' / n. \tag{5}$$

Quenouille (1947) has given a convenient method of evaluating an expression such as (5). 
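
To give a feeling for the size of the effect, the two approximations for first-order schemes, the finite sum (1) and its limiting form (2), may be evaluated for a short series. This is a sketch under stated assumptions: the function names are hypothetical, and the values n = 30 and first autocorrelations of 0.9 are illustrative choices, not figures taken from the paper.

```python
def var_r_finite_sum(n, rho, rho_prime):
    """Equation (1): n + 2*sum_{k=1}^{n-1} (n-k)*(rho*rho')**k, divided by n**2."""
    a = rho * rho_prime
    total = n + 2 * sum((n - k) * a ** k for k in range(1, n))
    return total / n ** 2


def var_r_closed_form(n, rho, rho_prime):
    """Equation (2): (1/n) * (1 + rho*rho') / (1 - rho*rho')."""
    a = rho * rho_prime
    return (1 + a) / (n * (1 - a))


n = 30
for rho in (0.0, 0.5, 0.9):                  # illustrative autocorrelations
    v1 = var_r_finite_sum(n, rho, rho)
    v2 = var_r_closed_form(n, rho, rho)
    print(f"rho = {rho:.1f}: (1) gives {v1:.4f}, (2) gives {v2:.4f}, "
          f"independent terms give {1 / n:.4f}")
```

With both first autocorrelations zero the two expressions reduce to 1/n; as the autocorrelations rise the variance is inflated several-fold, and for so short a series the limiting form (2) exceeds the finite sum (1).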

Now while these formulae make it perfectly clear that, in interpreting the significance of 
a correlation between two series, it is necessary to take account of their autocorrelations, 
they have, as is recognized, certain practical limitations. Besides being based on the true 
autocorrelations, which are never obtainable in practice, they involve large sample assump- 
tions which might not be reasonable for series of twenty or thirty items with which the 
economist must usually deal. It is also evident that, given only the formulae for var r, it is 
impossible, even neglecting the above considerations, to apply a test of significance without 
some knowledge concerning the shape of the distribution of sample correlations. 

With the above difficulties in mind, we decided that a sampling experiment would give 
some guidance as to a reasonable procedure for carrying out tests of significance in the case of 
small samples. Since we could only carry out a rather limited sampling experiment, we were 
anxious that the sampling model used should generate unrelated series which were as 
analogous as possible to economic time series. In this way we might hope to obtain the 

* The reader is also referred to a useful paper by Moran (1947), which gives the formula for the variance 
of the covariance between two series having known autoregressive properties. 





maximum guidance in a region of practical interest. Yule (1921, 1926, 1927), Wold (1938) 
and Kendall (1944, 1945, 1946) have stressed that for most economic time series an 
autoregressive scheme is probably more relevant than the assumption of exact harmonic 
oscillation and each of the above has made considerable use of the linear second-order 
autoregressive scheme in studies of economic time series. Orcutt (1948a) tested the hypothesis 
that the economic time series used by Tinbergen (1939) might be considered to have been 
obtained by drawings from a single population of linear stochastic series all having the same 
underlying autoregressive structure. This hypothesis was brought to a test by comparing the 
means and variances at each lag of the correlograms of the economic series with the means 
and variances at corresponding lags of the correlograms of several sets of other series 
constructed according to a variety of models. On the basis of these comparisons and also a 
similar set of comparisons of correlograms of first differences, the conclusion was reached 
that so far as the evidence went the set of 52 economic series might have been obtained by 
drawings from the population of series generated by the model 


$$Y_{t+1} = Y_t + 0.3(Y_t - Y_{t-1}) + \epsilon_{t+1}, \tag{6}$$

where $\epsilon$ is random in time and has an expected value of zero. 

Since equation (6) appears to us as the best available model for generating non-related 
series which are analogous to economic time series, we have used it as a basis for generating 
the series used in the remainder of this paper. It should be noted that equation (6) does not 
generate stationary series but rather a Brownian type of movement having no true mean. See 
Wold (1938, p. 63) for the distinction between stationary and evolutive time series. On the 
other hand, the series generated by equation (6) are not explosive in the sense that they tend to 
deviate from any given point or set up oscillations of ever increasing amplitude. Six 30-item 
segments which were selected without regard to shape from a long series generated by equation 
(6) are shown in Fig. 1. Since the formulae given earlier for var r were derived on the assumption 
of stationary autoregressive processes, it is clear that on this account alone it would not be 
safe without additional evidence to apply them to correlations between non-stationary 
series such as generated by equation (6). 

Fig. 1. 30-item segments of constructed series. 



II. Empirical distributions of correlations between non-related 
series generated by the autoregressive process 

$$Y_{t+1} = Y_t + 0.3(Y_t - Y_{t-1}) + \epsilon_{t+1}$$

We therefore sought to obtain empirically some idea of the distribution of correlations to be 
expected by chance between series drawn at random from the population of non-related 
series generated by the stochastic process of equation (6). In order to do this we first con- 
structed a series of 3240 items by means of (6). The random elements used were two digit 




400 Testing the significance of correlation between time series 

numbers taken from Tables of Random Sampling Numbers by M. G. Kendall and B.
Babington Smith (1939). They were read from left to right and double zeros were omitted.
To obtain a true mean of zero, 50 was subtracted from each. Thus the random numbers used
were drawn from a population having a rectangular distribution and a range of −49 through
0 to +49.
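The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a pseudo-random generator stands in for the printed tables of random numbers, the function name `generate_series` is hypothetical, and the two starting values are taken as zero because the text does not state them.

```python
import random

def generate_series(n_items, seed=0, coeff=0.3):
    """Generate Y[t+1] = Y[t] + coeff * (Y[t] - Y[t-1]) + e[t+1].

    Each shock is a two-digit number 01..99 (double zeros omitted) minus 50,
    i.e. an integer drawn uniformly from -49..+49 with mean zero.
    Assumption: the two starting values are zero.
    """
    rng = random.Random(seed)
    prev, curr = 0.0, 0.0
    series = []
    for _ in range(n_items):
        shock = rng.randint(1, 99) - 50
        nxt = curr + coeff * (curr - prev) + shock
        series.append(nxt)
        prev, curr = curr, nxt
    return series

long_series = generate_series(3240)
print(len(long_series))
```

Because each first difference equals 0.3 times the previous difference plus a fresh shock, the level wanders without a fixed mean, which is the non-stationary behaviour noted for equation (6).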

Our long series of 3240 items was first divided into 36 series each of 90 items. Each of these 
36 series was then divided into three series of 30 items. The first 30-item series of the first 
90-item series was labelled 1A, the second 1B and the third 1C. The first 30-item series of the
second 90-item series was labelled 2 A, the second 2B and so on. 

By the use of the usual product-moment formula for the correlation coefficient, correla- 
tions were obtained between all possible pairs of A series, between all possible pairs of B series 
and between all possible pairs of C series. That is, series 1A was correlated in turn with series
2A, 3A, ..., 36A; series 2A was correlated in turn with series 3A, 4A, ..., 36A, and so on.
Thus we obtained 630 correlations between pairs of A series, 630 between pairs of B series and 
630 between pairs of C series, making a total of 1890 correlations between different pairs of 
our 108 series each of 30 items. Then labelling the first 60 items of each 90-item series, series 
(A + B), we found the correlations between all possible pairs of the 36 (A + B) series. Having 
obtained these 630 correlations between series of 60 items we then found the 630 correlations 
obtained by correlating all the pairs of our 36 series of 90 items each. These series are labelled 
the (A + B + C) series. The labour of obtaining the above correlations was rather large but the 
calculations were considerably facilitated by use of a new type of calculating machine 
(Orcutt, 1948b). Now while it can be shown that, when the series are independent, the above
sampling procedure will lead to unbiassed estimates of the moments of the population of 
correlations between series drawn at random from the universe of series generated by the 
autoregressive process used, it is evident that the 1890 correlations obtained are not com- 
pletely independent, so that, while the effective number is substantially greater than 108 
(the number of independent series), it, nevertheless, is considerably less than 1890. The 
reason for using each series a large number of times is simply the saving of labour. 
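The bookkeeping of this sampling scheme can be checked with a short sketch. The stand-in data, the helper names `pearson_r` and `split_into_sets`, and the seed are assumptions made for illustration; only the 90-item blocks, the A, B, C segments and the pairing within each set come from the text.

```python
import random
from itertools import combinations
from math import sqrt

def pearson_r(x, y):
    """Usual product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def split_into_sets(long_series, block=90, piece=30):
    """Cut the long series into 90-item blocks, then each block into A, B, C."""
    blocks = [long_series[i:i + block] for i in range(0, len(long_series), block)]
    sets = {"A": [], "B": [], "C": []}
    for blk in blocks:
        for k, label in enumerate("ABC"):
            sets[label].append(blk[k * piece:(k + 1) * piece])
    return sets

# Stand-in data (assumption): a simple random walk of 3240 items.
rng = random.Random(1)
long_series, level = [], 0.0
for _ in range(3240):
    level += rng.randint(-49, 49)
    long_series.append(level)

sets = split_into_sets(long_series)
correlations = {
    label: [pearson_r(x, y) for x, y in combinations(series, 2)]
    for label, series in sets.items()
}
total = sum(len(v) for v in correlations.values())
print({k: len(v) for k, v in correlations.items()}, total)
```

Each set of 36 series yields 36 × 35 / 2 = 630 pairs, so the three sets together give the 1890 correlations referred to above.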

Table 1 gives the frequency distribution for our constructed series with n = 30, n = 60,
and n = 90, of the ratio of the mean square successive difference to the variance. This ratio is
usually denoted by $\delta^2/s^2$ and for infinite series $\delta^2/s^2 = 2(1 - \rho_1)$. For a discussion of this ratio
and tabulations of its probability distribution for random series, see von Neumann (1941,
1942) and Hart & von Neumann (1942).
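A minimal sketch of the ratio, and of the first lag autocorrelation implied by $\delta^2/s^2 = 2(1 - \rho_1)$, is given below. The function names are hypothetical, and the divisors (n − 1 for the mean square successive difference, n for the variance) follow the usual von Neumann convention, which is assumed here rather than taken from this passage.

```python
def von_neumann_ratio(y):
    """Mean square successive difference divided by the variance.

    Assumed divisors: n - 1 for the successive differences, n for the variance.
    """
    n = len(y)
    mean = sum(y) / n
    delta2 = sum((y[i + 1] - y[i]) ** 2 for i in range(n - 1)) / (n - 1)
    s2 = sum((v - mean) ** 2 for v in y) / n
    return delta2 / s2

def first_lag_autocorrelation(y):
    """Sample analogue of rho_1 obtained from delta^2/s^2 = 2(1 - rho_1)."""
    return 1.0 - 0.5 * von_neumann_ratio(y)

smooth = list(range(30))                 # a steadily trending series
print(first_lag_autocorrelation(smooth)) # close to 1, so the ratio is close to 0
```

Smooth, trending series give ratios near zero, which is why the entries of Table 1 crowd into the lowest classes.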

Table 2 gives the frequency distributions of the correlations obtained between pairs of 
each of the sets of 30-item series and their total together with the frequency distributions of 
correlations obtained between pairs of the 60-item series and the 90-item series. Fig. 2 shows 
graphically the frequency distribution of the correlations for n = 30 together with a curve 
showing the frequency distribution of correlations between pairs of non-related random 
normal series with n = 5.* We tested the fit of this theoretical curve by means of a $\chi^2$ test
with 20 classes and 19 d.f., since the variance of the theoretical curve has been approximately
fitted, and obtained a probability of less than 0.01. Figs. 3 and 4 show the distributions
of the correlation coefficient which we obtained with n = 60 and n = 90, respectively.
On the first we have drawn the curve for random series with n = 5 and on the second the

* The value 5 was chosen to make the variance 1/(n − 1) agree as closely as possible with the
observed variance 0.2736. Similarly n = 6 gives close agreement with the observed 0.2061 in Fig. 4.



G. H. Orcutt and S. F. James


401 


curve for random series with n = 6. With 17 d.f., we obtained a $\chi^2$ of 16.3 corresponding
to a probability of about 0.5 in the first case and with the same degrees of freedom we
obtained a $\chi^2$ of 15.9 corresponding to a probability of about 0.7 in the second case.
Table 3 gives the cumulative frequency distributions from the total of the 30-item series, and

Table 1. Frequency distribution of $\delta^2/s^2$ for series generated by
$Y_{t+1} = Y_t + 0.3\,(Y_t - Y_{t-1}) + e_{t+1}$

                                 Frequency
 $\delta^2/s^2$          n = 30     n = 60     n = 90

 0.00-0.10                 40         23         30
 0.10-0.20                 27          9          6
 0.20-0.30                 23          1          0
 0.30-0.40                  5          1          0
 0.40-0.50                  4          1          0
 0.50-0.60                  4          1          0
 0.60-0.70                  2          0          0
 0.70-0.80                  2          0          0
 0.80-0.90                  1          0          0

 Total                    108         36         36

 Mean $\delta^2/s^2$     0.195      0.104      0.062



Fig. 2. Frequency distributions of r, n = 30. Fig. 3. Frequency distribution of r, n = 60.


for the 60- and 90-item series. The above application of the $\chi^2$ test for testing the adequacy of
the theoretical curves is admittedly rough, both because we have only approximately fitted
the variances of the theoretical curves to our empirical distributions by our choice of n for the
distributions of correlations between random series, and because the correlations making up
our empirical distributions are not completely independent. The effect of both the above





Table 2. Frequency distributions of r

                                         Frequency*

                                 n = 30                      n = 60      n = 90
       r            Series A  Series B  Series C   Total     Series      Series
                                                  A, B, C     A + B     A + B + C

 -1.0 to -0.9          6.5      12.0       2.0      20.5       4.0         3.5
 -0.9 to -0.8         28.0      32.5      12.0      72.5      30.5        10.5
 -0.8 to -0.7         35.0      49.0      27.5     111.5      29.5        25.0
 -0.7 to -0.6         37.0      37.5      29.0     103.5      32.0        31.0
 -0.6 to -0.5         33.0      31.0      30.0     100.0      35.0        37.0
 -0.5 to -0.4         37.5      25.5      32.0      95.0      40.0        40.0
 -0.4 to -0.3         43.5      18.5      41.5     103.5      30.0        38.5
 -0.3 to -0.2         30.0      32.0      38.0     100.0      20.5        37.0
 -0.2 to -0.1         30.0      31.5      47.0     108.5      30.0        44.5
 -0.1 to -0.0         34.0      47.0      22.5     103.5      49.0        45.5
 +0.0 to +0.1         31.0      39.0      30.5     100.5      44.0        54.0
 +0.1 to +0.2         30.0      31.5      40.0     107.5      39.5        40.0
 +0.2 to +0.3         34.5      20.5      42.5     103.5      39.5        30.5
 +0.3 to +0.4         29.5      25.0      34.0      88.5      39.0        40.5
 +0.4 to +0.5         30.0      28.0      48.0     100.0      37.0        38.5
 +0.5 to +0.6         40.5      44.5      43.5     128.5      21.0        31.5
 +0.6 to +0.7         30.5      39.0      35.5     111.0      33.5        27.5
 +0.7 to +0.8         37.0      30.0      31.0      98.0      25.0        30.0
 +0.8 to +0.9         25.0      37.0      24.0      80.0      23.5        11.5
 +0.9 to +1.0         15.5      13.0       1.5      30.0       3.5         1.5

 Total               630.0     630.0     630.0    1890.0     630.0       630.0

 Mean               0.0048    0.0004    0.0439    0.0164   -0.0248     -0.0023
 $s^2 = \Sigma r^2/n$
                    0.2844    0.3071    0.2293    0.2736    0.2440      0.2061
 s                  0.5333    0.5542    0.4788    0.5231    0.4945      0.4540
 $\beta_1$          0.0047    0.0002    0.0335    0.0044    0.0140      0.0002
 $\beta_2$          1.8125    1.7720    1.9238    1.8476    1.9422      2.0357


Table 3. Cumulative frequency distributions with positive and negative r's combined

              Fraction greater than |r|
 |r|        n = 30     n = 60     n = 90

 0.00        1.00       1.00       1.00
 0.10        0.89       0.85       0.84
 0.20        0.77       0.73       0.70
 0.30        0.66       0.63       0.59
 0.40        0.56       0.51       0.46
 0.50        0.46       0.38       0.33
 0.60        0.33       0.29       0.22
 0.70        0.22       0.18       0.13
 0.80        0.11       0.10       0.04
 0.85        0.06       0.05       0.03
 0.90        0.03       0.01       0.01
 0.95        0.01       0.00       0.00
 1.00        0.00       0.00       0.00


* Values of r were calculated to two places of decimals. Those on the border of two class- 
intervals were allocated one-half to each interval. The same applies to Tables 4 and 5. 






circumstances will be to tend to exaggerate the $\chi^2$'s obtained and so underestimate the true
probabilities that discrepancies between the empirical frequency distributions and the
theoretical distributions are chance results.

Table 4 gives the distribution of 306 first-order partial correlations where n = 30 and
two explanatory series are used. These 306 comprise the combinations available from our
correlations between 30-item series using only partial correlations of the types $r_{i(i+1).(i+2)}$,
$r_{i(i+2).(i+1)}$ and $r_{(i+1)(i+2).i}$. Their frequency distribution is shown on Fig. 5 along with a
curve showing the frequency distribution of correlations between pairs of non-related
random series with n = 5.

Table 5 gives the distribution of 306 multiple correlations involving two explanatory
series. These correlations were obtained from the above partial correlations and zero-order
correlations. They are of the types $R_{i.(i+1)(i+2)}$, $R_{(i+1).i(i+2)}$ and $R_{(i+2).i(i+1)}$. Their
frequency distribution is shown on Fig. 6.
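The way partial and multiple correlations follow from zero-order correlations can be sketched with the standard textbook identities. The function names are hypothetical, and these formulae are offered as the usual relations rather than as a record of the computing routine actually followed.

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """First-order partial correlation r_{12.3} from zero-order correlations."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def multiple_R(r12, r13, r23):
    """Multiple correlation R_{1.23} from r_{12} and the partial r_{13.2}.

    Uses 1 - R^2 = (1 - r12^2) * (1 - r_{13.2}^2).
    """
    r13_2 = partial_r(r13, r12, r23)  # correlation of 1 and 3 with 2 held fixed
    return sqrt(1 - (1 - r12 ** 2) * (1 - r13_2 ** 2))

# With uncorrelated explanatory series, R^2 is the sum of the squared correlations.
print(multiple_R(0.6, 0.5, 0.0) ** 2)
```

Since R can never be smaller in magnitude than either zero-order correlation, the distribution of R piles up towards 1 even more strongly than that of r, as Table 5 shows.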




Fig. 5. Frequency distribution of first order partial correlations, n = 30.


In the next section we shall investigate whether sample-values can be used for unknown 
parameters in determining probabilities for the testing of correlations, but there are certain 
observations which might usefully be made at this stage. In the first place, it should be 
evident that the empirical distributions of this section provide a test of the null hypothesis 
that a given sample correlation might have occurred between series drawn at random from 
the population of series generated by equation (6). Since the distributions change very 
slowly with n, the length of the series, it should be possible to interpolate for a particular 
value of n with as much accuracy as our distributions justify. The position with regard to 
partial and multiple correlations is not so satisfactory since we have merely obtained, for 
n = 30, distributions of first-order partial correlations and multiple correlations involving 
two explanatory series. It is hoped, however, that these will be sufficient to give some idea of 
how high these coefficients must be in order to provide significant evidence against the null 
hypothesis and to throw some light on the possibility of getting very high multiple correla- 
tions between non-related series when n is less than 30 and four or five explanatory series are 
used. 




Table 4. Frequency distribution of first order partial correlations between series with n = 30

 $r_{ij.k}$         Frequency

 -1.0 to -0.9          2.0
 -0.9 to -0.8          5.0
 -0.8 to -0.7          7.0
 -0.7 to -0.6         15.0
 -0.6 to -0.5         19.0
 -0.5 to -0.4         12.0
 -0.4 to -0.3         24.0
 -0.3 to -0.2         28.0
 -0.2 to -0.1         18.0
 -0.1 to -0.0         17.5
 +0.0 to +0.1         13.5
 +0.1 to +0.2         23.0
 +0.2 to +0.3         22.0
 +0.3 to +0.4          7.0
 +0.4 to +0.5         20.0
 +0.5 to +0.6         25.0
 +0.6 to +0.7         17.0
 +0.7 to +0.8         13.0
 +0.8 to +0.9         14.0
 +0.9 to +1.0          4.0

 Total               306.0

 Mean                0.0502
 $s^2$               0.2341
 s                   0.4838
 $\beta_1$           0.1053
 $\beta_2$           1.9003


Table 5. Frequency distribution of multiple correlations between series with n = 30

 $R_{i.jk}$         Frequency

 +0.9 to +1.0         22.0
 +0.8 to +0.9         53.0
 +0.7 to +0.8         70.0
 +0.6 to +0.7         42.5
 +0.5 to +0.6         39.0
 +0.4 to +0.5         29.0
 +0.3 to +0.4         18.5
 +0.2 to +0.3         18.5
 +0.1 to +0.2          8.5
 +0.0 to +0.1          5.0

 Total               306.0

 Mean                0.6314
 $s^2$               0.4470
 s                   0.6686
 $\beta_1$           0.4474
 $\beta_2$           2.8201









In the second place, when we first discovered that the variance of the distribution of the 
zero order correlation does not become substantially smaller as n is increased* from 30 to 60
and 90, we thought this implied that little was to be gained by use of greater lengths of time 
series in forming estimates of inter-relationship. This, however, does not necessarily follow, 
even if it be granted that economic time series have approximately the autoregressive pro- 
perties of our constructed series. In particular, it does not follow if we imagine that we are 
dealing with a relation between such time series in which the error term in the relation is 
a random variable of constant expected variance over time. In this case, the variance of the 
related series will continue to grow with time since there is no true mean, but the variance of 
the error term will not. Therefore, as the series become longer, the variance of the error term 
will become a smaller and smaller fraction of the variance of the series being explained. This 
implies that the correlation coefficient will become higher and higher approaching unity as 
the series approach an infinite length. Thus, whilst almost as high correlations are to be 



Fig. 6. Frequency distribution of $R_{i.jk}$, n = 30.


expected by chance between series of n = 90 as for n = 30, substantially higher correlations
are to be expected as n increases if there is a real linear relation subject to a random error of 
constant expected variance over time. On the other hand, if the error term is not random in 
time but itself has continuity properties such as those of our series (6), then its expected 
variance will grow as the series become longer in the same way as the expected variance of our 
actual series grows. There will thus be no reason for the correlation to increase as the series 
become longer and we shall, in fact, be in the position of having gained little from the ex- 
tension of the series. In this case consideration needs to be given to the possibility of corre- 
lating something like first differences in order to obtain any substantial advantage from the 
use of longer series. 

In the third place, since for n = 60 and n = 90 we obtain good fits to our empirical distributions
by means of the theoretical distributions of correlations between random series for

* We have received a very interesting letter from R. C. Geary in which he shows that for independent
series generated by the rather similar process, $Y_t = Y_{t-1} + \epsilon_t$, the variance of correlations between series
tends toward a non-zero finite positive quantity which cannot be very small.



n = 5 and n = 6, it follows that we might with some justification use these distributions for
estimating the significance of correlations between economic time series. See David (1938)
for tables of the correlation coefficient. Even in the case of n = 30, the use of the distribution
for random series of n = 5 would not appear to overestimate the significance of correlations
greater than 0.9.

Finally, it may be of some interest to note the result if we make use of the mean first lag 
autocorrelation of our 108 series with n = 30 and attempt to estimate var r by means of 
equation (1). However, instead of evaluating equation (1) as given, we reduced it to the 


following expression

$$\operatorname{var} r \simeq \frac{1 + r_1 r_1'}{n\,(1 - r_1 r_1')} - \frac{2\, r_1 r_1'\,(1 - r_1^{\,n}\, r_1'^{\,n})}{n^2\,(1 - r_1 r_1')^2}. \qquad (7)$$


We have also substituted $r_1$ for $\rho_1$ and $r_1'$ for $\rho_1'$ since we intend to use sample rather than
theoretical values of the autocorrelations. Then using $1 - \tfrac{1}{2}$(the mean $\delta^2/s^2$ as given in
Table 1) for $r_1$ and $r_1'$ we obtain var r = 0.2759. This estimate should be compared with the
variance 0.2736 of our empirical distribution given in Table 2, and considering that Bartlett
assumed very long stationary series generated by a simple Markoff scheme, it is remarkable
that the estimate of var r should be as close as it is.
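As an illustration, the expression for var r can be evaluated with a few lines of Python. The function name `var_r_eq7` is hypothetical. Using the mean $\delta^2/s^2$ of 0.195 for n = 30 as rounded in Table 1, the sketch returns a value near 0.27; a small difference from the 0.2759 quoted above is to be expected if the original computation used an unrounded mean.

```python
def var_r_eq7(r1, r1_prime, n):
    """Approximate variance of r between two unrelated series, equation (7).

    r1 and r1_prime are the first lag autocorrelations of the two series,
    n is the number of items in each series.
    """
    a = r1 * r1_prime
    main_term = (1 + a) / (n * (1 - a))
    correction = 2 * a * (1 - a ** n) / (n ** 2 * (1 - a) ** 2)
    return main_term - correction

r1 = 1 - 0.5 * 0.195          # from the rounded mean ratio in Table 1
print(round(var_r_eq7(r1, r1, 30), 4))
```

When either autocorrelation is zero the expression reduces to 1/n, and the second term matters only when the product of the autocorrelations is high and n is small.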


III. Investigation of the relation of var r to the sample values

OF THE FIRST LAG AUTOCORRELATIONS

For the purpose of this study the 108 series, with n = 30, were separated into seven groups on
the basis of the observed values of their first lag autocorrelation, $r_1$. The definition of the
autocorrelation coefficient used for this purpose was

$$r_1 = 1 - \tfrac{1}{2}\,\delta^2/s^2, \qquad (8)$$

where $\delta^2 = \sum_i (Y_{i+1} - Y_i)^2/(n-1)$, and $s^2 = \sum_i (Y_i - \bar{Y})^2/n$.

As already mentioned, the distribution of the values of $\delta^2/s^2$ for our 108 series is given in
Table 1. The limits of $r_1$, the mean value of $r_1$, and the number of series for each of the seven
groups of series are given in Table 6. The limits were chosen so as to obtain approximately
equal numbers of series in each group.

Table 6. Classification of series on the basis of sample values of first lag autocorrelations

 Group no.    Limits of the $r_1$     Mean $r_1$     No. of series
              included in group       for group      in group

     1           0.570-0.832            0.732             15
     2           0.832-0.887            0.866             15
     3           0.887-0.912            0.896             15
     4           0.912-0.946            0.927             18
     5           0.946-0.963            0.955             18
     6           0.963-0.975            0.970             14
     7           0.975-0.988            0.980             13


The set of 1890 correlations, for n = 30, were sorted into 28 classes corresponding to the 
combinations of groups from which the paired series yielding the correlations were drawn. 
The variance of these correlations about zero is given for each of the 28 classes by the top
number in each box of Table 7. The middle figure in each of these boxes is the number of
correlations that occurred in the class. The bottom figure in each class is a ‘theoretical
variance’ obtained by using equation (7) with $r_1$ and $r_1'$ taken as the mean values of the first
lag sample autocorrelations corresponding to the groups of series being paired. For example,
to estimate the variance of correlations between series from set 1 and from set 3, we used


Table 7. Actual and theoretical values of var r for each of the 28 classifications 
of r based on the sample autocorrelations of the paired series * 



Group no. of first series of each pair 

1 

2 

3 

4 

5 

6 

7 


1 


0121 

0*101 

0*110 

0*154 

0*146 




34 

75 

72 

78 

98 

64 

23 



0104 

0-138 

0*148 

0*169 

0*170 

0*178 

0*183 


2 


0*188 

0*159 

0*213 


0*263 

0*267 




31 

72 

86 

92 

73 

69 

•a 



0*206 

0*232 

0*259 


0*312 

0*324 

& 

A 

3 



0*135 


0*274 

0*243 

0*340 

I 




37 


81 

69 

56 





0*269 

0*295 

0*338 

0*366 

0*375 

O 

9 









i 

4 







0*352 

12 





57 

89 

98 

61 

§ 

i 





0*343 

0*403 

0*440 

0*457 










O 

5 






0*459 

0*601 

§ 






52 

76 

90 

Q* 








0*632 

a 

P 









O 










6 







0*599 








31 

49 









0*637 


7. 







0*784 









27 









0*701 


* The top number in each box is the actual variance, the middle number is the number of correlations 
in the class, and the lower number in each box is a ‘theoretical’ variance. 


0.732 for $r_1$ in equation (7) and 0.896 for $r_1'$. It will be noticed that equation (7) only differs
from the more approximate form of equation (2) in the right-hand term. For low values of
$r_1$ or $r_1'$ or large n, this term will be insignificant. Thus, whereas var r, as estimated by
equation (7) for class 1-1, is only 0.006 less than the same estimate made by equation (2), we
find that for class 4-4 the use of equation (7) gives an estimate which is 0.097 less than that
obtained using equation (2). For higher values of $r_1 r_1'$ the difference becomes rapidly greater.


In Fig. 7 we have plotted the observed value of the variance for each class against the 
‘theoretical’ value as obtained using equation (7). The straight line on this diagram is merely 
the 45° slope and would hold if the ‘theoretical’ variance agreed exactly with the actual.
Notwithstanding some evidence of a systematic disagreement, the agreement obtained is 
remarkably good when it is considered that the assumptions under which Bartlett obtained 
the formula used for var r are far from being realized in this case. Not only are we dealing 
with non-stationary series having a small n, but we have used sample autocorrelations 
rather than true autocorrelations, and in addition our series were not generated by a Markoff 
process. Evidently we shall not go far wrong, even with our type of series, if for purposes of
testing the null hypothesis we estimate the variance of a correlation between two series by 
using equation (7). The error that this involves for series of our type would appear, on the 
basis of Fig. 7, to be an overestimation of the variance to the extent of about 15 % for almost 
the entire range of sample first lag autocorrelations covered by our experiment. 


Table 8. Frequency distributions of the absolute values of correlations between
series grouped according to their sample autocorrelations

                   A          B          C          D          E          F          G
                Classes    Classes    Classes    Classes    Classes    Classes    Classes
    |r|         1-1, 1-2,  1-5, 1-6,  2-3, 2-4,  2-5, 2-6,  4-4, 3-6,  4-6, 4-7,  6-6, 5-7,
                1-3, 1-4   1-7, 2-2   3-3, 3-4   2-7, 3-5   3-7, 4-5   5-5, 5-6   6-7, 7-7

 0   -0.10       66.50      56.25      37.00      29.75      14.50      19.75       1.00
 0.10-0.20       60.25      37.00      45.00      33.50      20.50      21.75       2.25
 0.20-0.30       45.75      43.50      40.50      29.25      32.00      19.50       4.00
 0.30-0.40       30.75      35.50      41.50      34.00      30.75      17.75       3.00
 0.40-0.50       28.00      27.25      41.50      49.50      28.75      24.00       5.25
 0.50-0.60       21.50      25.50      37.25      48.50      47.50      32.00      12.00
 0.60-0.70       12.25      18.00      30.25      38.25      44.75      37.75      24.75
 0.70-0.80        4.00      19.25      15.00      36.50      32.75      61.00      44.25
 0.80-0.90        0.00       3.75       9.00      15.75      17.75      48.00      58.25
 0.90-0.95        0.00       0.00       0.00       0.00       1.75       5.50      31.25
 0.95-1.00        0.00       0.00       0.00       0.00       0.00       0.00      11.00

 Sum               259        266        297        315        271        287        197

 Mean-square
 about 0         0.1070     0.1595     0.1889     0.2527     0.2885     0.3776     0.6138

 $\beta_2$ (using
 moments
 about zero)      2.447      2.283      1.619      1.674      1.559      1.462      1.140


Before the above method for estimating var r can be fully utilized, it is necessary to know 
something about the shape of the distributions involved and, in order to provide information 
about this point, we obtained the frequency distributions tabulated in Table 8 and shown in 
Figs. 8-14. The 28 classes of correlations were ordered according to the theoretical values of 
var r as given in Table 7. Then the correlations of the first four classes were grouped together 
into class A, the second four classes were grouped together into class B, and so on. Because 
these distributions should be symmetrical about zero and in view of the small sample sizes, 
we have obtained frequency distributions of absolute values of r. These distributions make it 
clear that the distribution of r for fixed values of $\rho_1$ and $\rho_1'$ depends on $r_1$ and $r_1'$ to a very
considerable extent. It is rather interesting to note the way in which the mode gradually
moves from zero towards 1 as the variance of the distribution increases. It is also interesting 
to note that, even for very high variances such as for distributions F and G, the ordinates of 
the curves still appear to approach zero for r approaching unity. 

We were interested in seeing how well our empirical frequency distributions could be fitted 
by frequency distributions of approximately the same variance of correlations between series, 
of normally distributed random items. Since the variance of correlations between random 
series is 1/(n − 1), we have used the distribution for random series of n = 11 to fit the
frequency distribution of A, Fig. 10, the distribution for random series of n = 7 for B, that of
n = 6 for C, and that of n = 5 for D. We did not bother with the rest since it is obvious that
the fit would be completely unsatisfactory. As a rough test of the goodness of the above fits,
we applied a $\chi^2$ test in each case. In case A we obtained a $\chi^2$ of 2.663 with 6 d.f. and this
corresponds to a probability of above 0.8. In case B, $\chi^2$ was 5.105 with 6 d.f. and this
corresponds to a probability of about 0.5. For C, $\chi^2$ was 9.082 with 8 d.f. and this corresponds
to a probability of about 0.3. In case D, $\chi^2$ was 33.634 with 8 d.f. and this has a probability
of less than 0.001. The same remarks apply concerning the roughness of the above test as we
made in § II and it should also be true here that our use of the $\chi^2$ test tends to
underestimate the probabilities.

The primary implication of the results given in this section appears to us as follows. 
A reasonable way of testing the significance on the null hypothesis of a correlation between 
economic time series is first to estimate var r by means of equation (7) and, in so doing, to use 
the sample values of the first lag autocorrelations of the two series. Secondly, if the estimated 
var r is less than about 0.25, then insert it into the formula var r = 1/(n′ − 1) and evaluate n′.
Round n′ off to the nearest integer and make use of the distribution of r that applies for
random series with n equal to this integer to evaluate the probability that r might have
occurred by chance between two non-related series. If the estimated var r turns out to be
more than 0*25, then it seems unlikely that this method is valid, but the above test of signifi- 
cance would appear to be stronger than necessary and so would at least be a safe one to apply. 
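The two-step procedure described in this paragraph can be sketched as follows. The sketch assumes that var r has already been estimated from equation (7). In place of printed tables of the correlation coefficient it uses the standard null density of r between two independent normal random series of length n, which is proportional to $(1 - r^2)^{(n-4)/2}$, integrated numerically. The function names and the number of integration steps are illustrative assumptions.

```python
from math import floor, gamma, pi, sqrt

def equivalent_n(var_r):
    """Solve var r = 1/(n' - 1) for n' and round to the nearest integer."""
    return int(floor(1 + 1 / var_r + 0.5))

def two_sided_probability(r, n, steps=20000):
    """P(|R| >= |r|) for two independent normal random series of n items (n >= 4)."""
    if n < 4:
        raise ValueError("this numerical sketch needs n >= 4")
    k = (n - 4) / 2
    const = gamma((n - 1) / 2) / (sqrt(pi) * gamma((n - 2) / 2))
    lo = abs(r)
    h = (1 - lo) / steps
    # midpoint rule for the integral of (1 - x^2)^k from |r| to 1
    area = sum((1 - (lo + (i + 0.5) * h) ** 2) ** k for i in range(steps)) * h
    return min(1.0, 2 * const * area)

n_equiv = equivalent_n(0.2736)        # the empirical variance found for n = 30
print(n_equiv, two_sided_probability(0.9, n_equiv))
```

With the empirical variance 0.2736 the equivalent length is 5, so a correlation between two 30-item series of this type is judged as if it came from only five independent observations.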

Secondly, our study would seem to offer some evidence that the variance of correlations 
between pairs of unrelated series of a given n, and with given sample autocorrelations, will be
nearly if not completely independent of the true autocorrelations of the series. This follows
from the fact that in Fig. 7 the points follow very closely the line that they would have been
expected to follow if we had been dealing with sets of correlations associated with sets of series 
having different true autocorrelations. If the true autocorrelations were actually known, one 
might, at least in principle, first evaluate the probability of independently drawing two series 
with certain sample autocorrelations, and secondly evaluate the probability of obtaining by 
chance a given correlation between series having their sample autocorrelations. Having done 
this, it might be possible to evaluate the combined probability of drawing two series inde- 
pendently which had the autocorrelations and correlation between them which were actually 
obtained. 



Fig. 7. Observed var r plotted against theoretical var r.

Figs. 8-14. Frequency distributions of |r| for classes A to G.


IV. Summary and a general remark on the problem of
DETECTING REAL RELATIONS

The problem dealt with in this paper was that of testing the significance of correlations 
between economic time series. We constructed a set of non-related series with the same 
properties as are believed to hold for yearly series of a substantial group of economic time 
series. We then obtained a large number of correlations between these constructed series
and on the basis of various distributions of these correlations came to the following 
conclusions. 


1. On the assumption that the economic time series being dealt with are analogous to our
specific model, $Y_{t+1} = Y_t + 0.3\,(Y_t - Y_{t-1}) + e_{t+1}$, correlations between pairs of series can be tested
for the null hypothesis using our empirical distributions. Since it was shown that for n = 60
and n = 90 good fits to our empirical distributions can be obtained by use of the ordinary
distribution of correlations between random series with n = 5 and n = 6 respectively, it
follows that an alternative procedure is to test correlations between economic series by
means of these distributions.


2. The distribution of correlations between non-related series depends primarily on the 
sample autocorrelations of the paired series and very little, if at all, on the true autocorrela- 
tions, given the sample values. This makes it reasonable to apply tests based upon sample 
autocorrelations, and it was shown that one reasonable procedure is to use equation (7) to 
estimate the variance of r and then use the estimated variance of r to select the appropriate 
distribution of correlations between random series. Having chosen this distribution, one 
can then test the significance of the correlation in the usual way. The properties of this 
conditional test might repay a theoretical examination. 

3. If economic time series are analogous to the constructed series used in this paper then, 
except in the cases where the sample autocorrelations happen to be low, such high correla- 
tions between economic time series may be expected by chance that we are unlikely to detect 
real relations. The distributions given for the partial and multiple correlations only accentu- 
ate this view. One method which suggests itself of at least partially extricating ourselves 
from this situation is to make an autoregressive transformation of the time series involved in 
such a way that at least one of the series becomes approximately random in time. When this 
has been done, the correlation between the transformed series may then be tested in the usual 
way (Bartlett, 1935). Thus, for example, suppose that

$$x_{1t} = \beta x_{2t} + u_t, \qquad (9)$$

and the error term $u_t$ is generated by

$$u_t = \alpha u_{t-1} + e_t, \qquad (10)$$

where $e_t$ is a random variable. If $\beta = 0$, then $x_{1t} = u_t$ and an appropriate autoregressive
transformation is

$$x'_{1t} = x_{1t} - \alpha x_{1,t-1}, \qquad x'_{2t} = x_{2t} - \alpha x_{2,t-1},$$

so that in terms of the transformed variables we have

$$x'_{1t} = \beta x'_{2t} + e_t. \qquad (11)$$

Since one of the two variables is now random we can apply the usual test of significance of
correlation between $x'_{1t}$ and $x'_{2t}$ with the consequent advantage of a great increase in the
effective degrees of freedom. On the other hand, if $\beta \neq 0$, then the expected value of the least
square estimate of $\beta$ in equation (11) will still be the same as in equation (9) and the expected
value of estimates of the true correlation between $x'_{1t}$ and $x'_{2t}$ will still be very nearly the same
as the expected value of estimates of correlation between $x_{1t}$ and $x_{2t}$. This means that in this
case, at least, the transformed form (11) is far more effective than the untransformed form (9),
for the purpose of testing the null hypothesis. When $\beta \neq 0$, then the appropriate autoregressive
transformation for estimating $\beta$ is not one which leaves one of the series random but
rather one that leaves the error term random. However, in the case $\beta = 0$, it is evident that
when the error term is random then the variable which is taken to be dependent must also be
random since it is entirely composed of the error term.*
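The effect of the autoregressive transformation can be seen in a small sketch. It assumes that the coefficient α of equation (10) is known, whereas in practice it would have to be estimated. The function name, the seed and the value 0.9 are illustrative assumptions.

```python
import random

def autoregressive_transform(x, alpha):
    """Return x'[t] = x[t] - alpha * x[t-1] for t = 1, ..., len(x) - 1."""
    return [x[t] - alpha * x[t - 1] for t in range(1, len(x))]

# Build an error term u[t] = alpha * u[t-1] + e[t] from known random shocks e.
rng = random.Random(7)
alpha = 0.9                               # assumed known for this illustration
e = [rng.gauss(0.0, 1.0) for _ in range(200)]
u = [e[0]]
for t in range(1, len(e)):
    u.append(alpha * u[t - 1] + e[t])

# With beta = 0 the dependent series is u itself; transforming it recovers e.
recovered = autoregressive_transform(u, alpha)
print(max(abs(a - b) for a, b in zip(recovered, e[1:])))
```

The transformed series is one item shorter than the original, but its items are random in time, so the ordinary test of significance applies to it with nearly the full number of degrees of freedom.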

REFERENCES 

Aitken, A. C. (1935). On least squares and linear combination of observations. Proc. Roy. Soc. Edinb.
55, 42.

Bartlett, M. S. (1935). Some aspects of the time correlation problem in regard to tests of significance.
J. Roy. Statist. Soc. 98, 536.

Bartlett, M. S. (1946). On the theoretical specification and sampling properties of autocorrelated
time-series. J. Roy. Statist. Soc. Supplement, 8, 27.

Bartlett, M. S. (1947). Stochastic Processes. Mimeographed notes of a course given at the University
of North Carolina in the Fall Quarter, 1946.

Champernowne, D. C. (1948). Sampling theory applied to autoregressive time-series. J. Roy.
Statist. Soc., Series B, 10 (in press).

Cochrane, D. & Orcutt, G. H. (1949). Application of least squares regression analysis to relationships
containing autocorrelated error terms. J. Amer. Statist. Ass. 44 (in press).

David, F. N. (1938). Tables of the Correlation Coefficient. Cambridge University Press.

Hart, B. I. & von Neumann, J. (1942). Tabulation of the probabilities for the ratio of the mean square
successive difference to the variance. Ann. Math. Statist. 13, 207.

Kendall, M. G. & Babington Smith, B. (1939). Tracts for Computers, no. xxiv. Cambridge University
Press.

Kendall, M. G. (1944). Oscillatory movements in English agriculture. J. Roy. Statist. Soc. 106, 91.

Kendall, M. G. (1945). On the analysis of oscillatory time-series. J. Roy. Statist. Soc. 108, 93.

Kendall, M. G. (1946). Contributions to the Study of Oscillatory Time-Series. National Institute of
Economic and Social Research, Occasional Papers x. Cambridge University Press.

Moran, P. A. P. (1947). Some theorems on time series. I. Biometrika, 34, 281.

Orcutt, G. H. (1948a). A study of the autoregressive nature of the time series used for Tinbergen’s
model of the economic system of the United States, 1919-1932. J. Roy. Statist. Soc., Series B,
10, 1.

Orcutt, G. H. (1948b). A new regression analyser. J. Roy. Statist. Soc., Series A, 111 (in press).

Pearson, E. S. (1931). The test of significance for the correlation coefficient. J. Amer. Statist. Ass. 26,
128.

Quenouille, M. H. (1947). Notes on the calculation of autocorrelations of linear autoregressive
schemes. Biometrika, 34, 365.

Tinbergen, J. (1939). Statistical Testing of Business-Cycle Theories. Vol. 2: Business Cycles in the
United States of America 1919-1932. Geneva: League of Nations.

von Neumann, J. (1941). Distribution of the ratio of the mean square successive difference to the
variance. Ann. Math. Statist. 12, 367.

von Neumann, J. (1942). A further remark on the distribution of the ratio of the mean square
successive difference to the variance. Ann. Math. Statist. 13, 86.

Wold, H. (1938). A Study in the Analysis of Stationary Time Series. Uppsala.

Yule, G. U. (1921). On the time correlation problem. J. Roy. Statist. Soc. 84, 497.

Yule, G. U. (1926). Why do we sometimes get nonsense correlations between time series? J. Roy.
Statist. Soc. 89, 1.

Yule, G. U. (1927). On a method of investigating periodicities in disturbed series with special reference 
to Wolfer’s sunspot numbers. Philos. Trans. A, 226, 267. 

* For a fuller discussion of the problem of estimation when the error term is autocorrelated, see Aitken 
(1935), Champernowne (1948) and Cochrane & Orcutt (1949). 



[ 414 ] 


MISCELLANEA 


Note on the median of a multivariate distribution 

By J. B. S. HALDANE 


The median of a univariate distribution is an exceedingly useful parameter but, whereas the notions of 
the mean and mode can be applied without ambiguity to distributions in two or more dimensions, this 
is not so for the median. It is the object of this note to point out that when we are dealing with
multivariate distributions, there are two quite distinct sets of parameters, each of which possesses some of
the properties of the univariate median, while lacking others.

The possibility arises from the fact that the univariate median is a location parameter associated with
two quite different scale parameters. In the first place, for the distribution $dF = f(x)\,dx$, the median is
defined as $M$, where

$$\int_{-\infty}^{M} dF = \int_{M}^{\infty} dF = \tfrac{1}{2}.$$

Integration here and throughout is understood in Stieltjes's sense.

When so defined, the median is obviously associated with the quartiles defined by
$\int_{-\infty}^{Q_1} dF = \tfrac{1}{4}$ and $\int_{Q_3}^{\infty} dF = \tfrac{1}{4}$, and with the interquartile range $Q_3 - Q_1$.

Secondly, however, we may define the median as the value $M$ which minimizes
$\int_{-\infty}^{\infty} |M - x|\, dF$. Similarly, the mean can be defined as the value $m$ which minimizes
$\int_{-\infty}^{\infty} (m - x)^2\, dF$. Now the


minimum value of this quantity is simply the variance. Just as the mean is associated with the standard 
deviation as a measure of dispersion, so on this definition the median is associated with the mean deviation 
about the median. The more commonly used measure, the mean deviation about the mean, has perhaps 
less to recommend it, since it is not a stationary value, and therefore more liable to error if the corre- 
sponding scale parameter is in error. In geometrical language the median is the point the sum of whose 
distances from the representative points of the sample is a minimum. 

Both these definitions of the median are equivalent in the univariate case, and both are of course 
indeterminate if the number of members of a sample is even, unless an even number of them coincide 
with the median. The various devices which avoid this indeterminacy represent the median as a limit. 

When we pass to two or more dimensions these two definitions are no longer equivalent, and it seems 
worth while to distinguish the two analogues of the univariate median as the arithmetic and geometric 
medians. 

If we have a number of variates $x, y, z, \ldots$ the arithmetic median is the set of values $(X, Y, Z, \ldots)$,
where $X, Y, Z, \ldots$ are the medians of $x, y, z, \ldots$ defined in either of the two above ways. When $x, y, z$, etc.,
are different in kind it is obviously the only reasonable generalization. It has the merit of being invariant, 
like the median, when any of the variates is replaced by a monotonic function of it. But it is not invariant 
under a rotation of axes. 

For consider the arithmetical median of three coplanar points. If we take rectangular axes their 
co-ordinates are $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$. Those of the median are $(x_m, y_m)$, where $x_m$ is the middle value of
$x_1, x_2, x_3$ if they are all different, and the value of the two equal ones if two are equal. Hence as we rotate
the axes we find that the position of the arithmetic median changes. Unless it coincides with one of the 
apices of the triangle, one of the sides must subtend a right angle at it. In fact, the locus consists of those 
arcs of the circles which have the sides of the triangle as diameters which lie within the triangle (see 
Fig. 1). If the triangle has a right or obtuse angle, this angle lies on the locus, as does the foot of the 
perpendicular from it on the opposite side. If the triangle is acute angled, it passes through the feet of 
the three perpendiculars. Similarly, the locus of the arithmetic median of the vertices of a tetrahedron 
consists of portions of spheres. For an odd number of more than three points in a plane, the arithmetic 
median may always be one of them, or its locus may consist of a series of circular arcs. For an even number 
it is of course indeterminate, unless one or other of the special conventions devised for the univariate
case is used. 
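
The dependence of the arithmetic median on the choice of axes can be checked numerically. The following is a minimal Python sketch, not part of the original note: the triangle, the rotation angle and the function names are illustrative assumptions. It computes the co-ordinate-wise median in rotated axes and reports it in the original axes.

```python
import math
from statistics import median


def arithmetic_median(points):
    """Co-ordinate-wise median of a list of (x, y) points."""
    return (median(p[0] for p in points), median(p[1] for p in points))


def arithmetic_median_rotated(points, angle):
    """Arithmetic median taken in axes rotated by `angle` (radians),
    expressed back in the original co-ordinate system."""
    c, s = math.cos(angle), math.sin(angle)
    rotated = [(x * c + y * s, -x * s + y * c) for x, y in points]
    mx, my = arithmetic_median(rotated)
    # rotate the median back into the original axes
    return (mx * c - my * s, mx * s + my * c)


# Hypothetical acute-angled triangle, chosen only for illustration.
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]

m0 = arithmetic_median_rotated(triangle, 0.0)                # original axes
m30 = arithmetic_median_rotated(triangle, math.radians(30))  # axes turned by 30 degrees

print("median in original axes:", m0)
print("median in rotated axes: ", m30)
```

In this example the two medians differ, and the median found in the rotated axes is not a vertex: it takes one co-ordinate from the first point and the other from the third, so the side joining those two points subtends a right angle at it, in agreement with the locus described above.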




The geometric median is defined as the point such that the sum of its distances from the sample points
is a minimum. It is invariant under a change of axes, but is not invariant when the scales in different
directions are altered. Its sole value is therefore in problems of geometrical probability. It occurred in 
a problem of this type during the recent war, and might perhaps be of value in studies on such aggregates 
as star clusters, where we desire to find a representative point which is less affected than the centroid 
by outliers which may not be members of the cluster. 

The geometrical median of three coplanar points is the point in the triangle formed by them at which
each side subtends an angle of $\tfrac{2}{3}\pi$, provided that no angle of the triangle exceeds $\tfrac{2}{3}\pi$. If one angle exceeds
this value, the geometrical median is the obtuse vertex of the triangle. I have been unable to find any 
simple geometrical construction in the case of more than three points. It is, however, easy to show that 




the geometrical median is unambiguously defined. For let us take it (or, per impossibile, one of the
geometrical medians) as our origin of Cartesian co-ordinates. Consider a set of $n$ coplanar points $(x_r, y_r)$.
First suppose that no sample point coincides with the origin, and if necessary rotate the axes so that no
axis passes through a sample point. Let $R$ be the sum of the distances of the sample points from the point
$(x, 0)$. Then

$$R = \sum_{r=1}^{n} \left[(x - x_r)^2 + y_r^2\right]^{1/2}, \qquad \frac{dR}{dx} = \sum_{r=1}^{n} \left[(x - x_r)\{(x - x_r)^2 + y_r^2\}^{-1/2}\right].$$

This must be zero when $x = 0$. But

$$\frac{d^2R}{dx^2} = \sum_{r=1}^{n} \left[y_r^2 \{(x - x_r)^2 + y_r^2\}^{-3/2}\right].$$

All the terms in this sum are necessarily positive, since the denominator is the cube of a distance
which is taken as positive and can never change its sign. Hence $d^2R/dx^2$ is always positive, and $R$ has only
one minimum.

Next suppose that the median coincides with one of the points, say the first; then

$$R = |x| + \sum_{r=2}^{n} \left[(x - x_r)^2 + y_r^2\right]^{1/2}, \qquad \frac{dR}{dx} = \pm 1 + \sum_{r=2}^{n} \left[(x - x_r)\{(x - x_r)^2 + y_r^2\}^{-1/2}\right],$$

$d^2R/dx^2$ is positive as before, but $dR/dx$ has a saltus at $x = 0$, increasing in value by 2, and changing
sign. $R$ has therefore a sharp minimum. The proof in three or more dimensions is analogous. Changing
to polar co-ordinates with the origin as centre and the co-ordinates of the sample points as $(\rho_r, \theta_r)$, it
follows that if the median does not coincide with one of them,

$$\sum \cos\theta_r = \sum \sin\theta_r = 0,$$

whilst if it does so, these sums lie between ± 1. If several sample points coincide with the geometric 
median, the modifications are obvious. 
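
The note gives no construction for more than three points, but the condition $\sum \cos\theta_r = \sum \sin\theta_r = 0$ can be checked numerically. The sketch below is an illustration only and is not taken from the note: it uses a Weiszfeld-type fixed-point iteration (a technique swapped in here for convenience), and the function names and sample points are assumptions.

```python
import math


def geometric_median(points, iterations=500):
    """Weiszfeld-type iteration for the point minimizing the sum of distances.
    Illustrative sketch: it starts from the centroid and simply stops if an
    iterate lands exactly on a sample point, which a careful implementation
    would treat separately."""
    n = len(points)
    x = sum(p[0] for p in points) / n
    y = sum(p[1] for p in points) / n
    for _ in range(iterations):
        wx = wy = w = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d == 0.0:
                return (x, y)
            wx += px / d      # each sample point is weighted by 1/distance
            wy += py / d
            w += 1.0 / d
        x, y = wx / w, wy / w
    return (x, y)


def direction_sums(points, centre):
    """Return (sum of cos theta_r, sum of sin theta_r) as seen from `centre`."""
    sc = ss = 0.0
    for px, py in points:
        d = math.hypot(px - centre[0], py - centre[1])
        sc += (px - centre[0]) / d
        ss += (py - centre[1]) / d
    return (sc, ss)


def total_distance(points, centre):
    return sum(math.hypot(px - centre[0], py - centre[1]) for px, py in points)


# Hypothetical triangle with no angle exceeding 2*pi/3, so the median is interior.
sample = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
gm = geometric_median(sample)
sum_cos, sum_sin = direction_sums(sample, gm)
centroid = (sum(p[0] for p in sample) / 3, sum(p[1] for p in sample) / 3)

print("geometric median:", gm)
print("sum cos, sum sin:", sum_cos, sum_sin)
```

For these points both sums should come out numerically zero, and the total distance from the computed point should not exceed that from the centroid.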

It is clear that the minimum sum of the distances, divided by the number of points, is the many- 
dimensional analogue of the mean deviation from the median in one dimension. 

To sum up, the arithmetical median is obviously to be preferred in ordinary statistical work, but the 
geometrical median has certain advantages in problems of geometrical probability. In either case it is 
desirable to state clearly how the median is defined. 





A property of rank correlations 

By H. E. DANIELS 

The product-moment correlation coefficient may be considered a satisfactory measure of the association
between two variates, both because of its special relevance to the bivariate normal distribution, and,
more generally, from the fact that the square of its population value is the fraction of the variance of one
variate removed by linear regression on the other.

When the data are presented in ranked form, however, the suitable choice of a measure of concordance
between rankings is less obvious. Spearman's analogous use of the product-moment correlation coefficient
$\rho$ between the ranks cannot be so easily justified, though when calculated from the sum of squares of
rank differences it is seen to measure in a rather arbitrary sense the degree of agreement between ranks.
Kendall's coefficient $\tau$ has a more direct interpretation in that it is a function of the total number of
corresponding pairs of ranks which agree in order, and Moran (1948) has further shown that $\tau$ is simply
related to the least number of interchanges required to bring two rankings into perfect agreement.
It may be said, therefore, that $\tau$ is a satisfactory measure of concordance in the sense that increasing
values of $\tau$ correspond to increasing agreement between the rankings as defined in either of these ways.
In a previous paper (1944) I introduced the quantity

$$\Gamma = \frac{\sum a_{ij} b_{ij}}{\sqrt{\sum a_{ij}^2 \sum b_{ij}^2}}$$

as a general measure of the correlation between two sets of observations, ranked or otherwise, where
$a_{ij}$, $b_{ij}$ are scores assigned to corresponding pairs $i, j$ of the two variables and $a_{ij} = -a_{ji}$, $b_{ij} = -b_{ji}$.
Both $\rho$ and $\tau$ are included as special cases. It is of interest to see how far $\Gamma$ may be justified as a suitable
measure of rank correlation.
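
As an illustration of how $\rho$ and $\tau$ arise as special cases, the following Python sketch evaluates the general coefficient for a chosen score function. It assumes the normalized form $\Gamma = \sum a_{ij} b_{ij} / \sqrt{\sum a_{ij}^2 \sum b_{ij}^2}$ with scores depending only on the rank difference; the function names and the example rankings are hypothetical.

```python
import math


def general_gamma(p, q, score):
    """General correlation coefficient with a_ij = score(p_j - p_i) and
    b_ij = score(q_j - q_i), summed over all ordered pairs (i, j)."""
    n = len(p)
    num = sum_a2 = sum_b2 = 0.0
    for i in range(n):
        for j in range(n):
            a = score(p[j] - p[i])
            b = score(q[j] - q[i])
            num += a * b
            sum_a2 += a * a
            sum_b2 += b * b
    return num / math.sqrt(sum_a2 * sum_b2)


def kendall_score(d):
    """Score +1, 0 or -1 according to the sign of the rank difference."""
    return (d > 0) - (d < 0)


def spearman_score(d):
    """Score equal to the rank difference itself."""
    return d


# Hypothetical pair of rankings of five objects.
p = [1, 2, 3, 4, 5]
q = [2, 1, 4, 3, 5]

tau = general_gamma(p, q, kendall_score)
rho = general_gamma(p, q, spearman_score)
print("tau =", tau, " rho =", rho)
```

For these rankings two of the ten pairs disagree in order, so the sign scores give $(8 - 2)/10 = 0.6$, and the rank-difference scores give $1 - 6\sum d^2 / \{n(n^2 - 1)\} = 0.8$.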

Provided the scores have the same sign as the difference of the corresponding ranks, $\Gamma$ takes the values
$\pm 1$ when there is respectively complete concordance or discordance between the rankings, and when the
ranked attributes are independent, $\Gamma$ is on the average zero (though it must be remembered that a zero
average value of $\Gamma$ does not necessarily imply independence). A further property is, however, required
before $\Gamma$ can be accepted as satisfactory; it has to be shown that increasing values of $\Gamma$ correspond
in some way to increasing concordance between the rankings.

A property of this type which one might expect of a rank correlation coefficient is that if any two
corresponding pairs of ranks do not agree in order, the value of the coefficient should increase when the
members of one of the pairs are interchanged. It is now shown that $\Gamma$ has this property provided the
scores $a_{ij}$, $b_{ij}$ do not decrease with increasing rank separation and are not zero except for tied ranks. The
scores for both $\rho$ and $\tau$ satisfy the conditions.

Let $p_i$, $q_i$ be the ranks of the $i$th members of the two sets of observations, and suppose that $p_r > p_s$,
$q_r < q_s$, and that $p_r$, $p_s$ are to be interchanged. The denominator of $\Gamma$ is unaffected by any relative
permutation of the ranks. Initially, the numerator is

$$\sum a_{ij} b_{ij} = \sum{}'' a_{ij} b_{ij} + \sum{}'' a_{rj} b_{rj} + \sum{}'' a_{ir} b_{ir} + \sum{}'' a_{sj} b_{sj} + \sum{}'' a_{is} b_{is} + a_{rs} b_{rs} + a_{sr} b_{sr}$$

$$= \sum{}'' a_{ij} b_{ij} + 2\sum{}'' a_{rj} b_{rj} + 2\sum{}'' a_{sj} b_{sj} + 2 a_{rs} b_{rs},$$

where $\sum''$ denotes summation over all values of $i, j$ excluding $r$ and $s$. After interchanging $p_r$, $p_s$ it becomes

$$\sum{}'' a_{ij} b_{ij} + 2\sum{}'' a_{sj} b_{rj} + 2\sum{}'' a_{rj} b_{sj} - 2 a_{rs} b_{rs},$$

which is an increase of

$$-2\sum{}'' (a_{sj} - a_{rj})(b_{sj} - b_{rj}) - 4 a_{rs} b_{rs} = -2\sum (a_{sj} - a_{rj})(b_{sj} - b_{rj}).$$

Now introduce the condition that $a_{ij}$ is a non-decreasing function of $p_j - p_i$, and consider $a_{sj} - a_{rj}$ for all
values of $j$. The following are the possible alternatives.

(i) $p_j \ge p_r > p_s$. $a_{sj} > 0$, $a_{rj} \ge 0$ and $a_{sj} \ge a_{rj}$.

(ii) $p_r > p_j \ge p_s$. $a_{sj} \ge 0$, $a_{rj} < 0$, so that $a_{sj} - a_{rj} = a_{sj} + a_{jr} > 0$.

(iii) $p_r > p_s > p_j$. $a_{sj} < 0$, $a_{rj} < 0$ and $a_{jr} \ge a_{js}$, so that $a_{sj} \ge a_{rj}$.

Thus in all cases $a_{sj} - a_{rj} \ge 0$ and in at least one case $a_{sj} - a_{rj} > 0$.

Similarly, since $q_r < q_s$ the fact that $b_{ij}$ is a non-decreasing function of $q_j - q_i$ implies that $b_{sj} - b_{rj} \le 0$
and $< 0$ at least when $a_{sj} - a_{rj} > 0$. It follows that $\Gamma$ is increased by the interchange.





On the other hand, if the scores decrease with increasing rank separation the result is not necessarily
true. For suppose $a_{ij} = \pm A$, a large number, when $p_j - p_i = \pm 1$, and $a_{ij} = \pm 1$ as for Kendall's $\tau$ when
$|p_j - p_i| > 1$, with similar scores $b_{ij}$ for $q_j - q_i$. Then for the ranking

8 6 7 1 4 2 5 3

1 2 3 4 5 6 7 8

the value of $\Gamma$ is decreased on interchanging 6 and 4 in the first ranking.
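
The counter-example can be verified directly. The sketch below is illustrative only: it takes $A = 10$ as the "large number" (an assumption, since the note fixes no value), uses the same score function for both rankings, and the helper names are hypothetical. For comparison it also evaluates Kendall's sign scores, for which the same interchange increases the coefficient.

```python
import math


def general_gamma(p, q, score):
    """Sum of a_ij * b_ij over all ordered pairs, normalized by the score norms."""
    n = len(p)
    num = sum_a2 = sum_b2 = 0.0
    for i in range(n):
        for j in range(n):
            a = score(p[j] - p[i])
            b = score(q[j] - q[i])
            num += a * b
            sum_a2 += a * a
            sum_b2 += b * b
    return num / math.sqrt(sum_a2 * sum_b2)


def sign(d):
    return (d > 0) - (d < 0)


A = 10.0  # assumed value for the "large number" A


def spiked_score(d):
    """Score +-A when the ranks differ by exactly 1, otherwise +-1:
    the scores DEcrease in size as rank separation grows beyond 1."""
    return sign(d) * (A if abs(d) == 1 else 1.0)


first = [8, 6, 7, 1, 4, 2, 5, 3]
second = [1, 2, 3, 4, 5, 6, 7, 8]
# interchange the ranks 6 and 4 in the first ranking
swapped = [4 if v == 6 else 6 if v == 4 else v for v in first]

gamma_before = general_gamma(first, second, spiked_score)
gamma_after = general_gamma(swapped, second, spiked_score)
tau_before = general_gamma(first, second, sign)
tau_after = general_gamma(swapped, second, sign)

print("spiked scores: ", gamma_before, "->", gamma_after)
print("Kendall scores:", tau_before, "->", tau_after)
```

The pair (6, 4) is discordant before the interchange, yet with the spiked scores the coefficient falls, because the adjacent ranks 6 and 7, which carried a contribution of order $A^2$, are separated by the interchange.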

When tied ranks are present, any pair of ties is scored zero, but some consideration will show that
$\Gamma$ is still increased if a discordant pair is brought to concordance by an interchange.

The interchange of a particular pair of discordant ranks will in general alter the order of some of
the other pairs which involve one or other of the ranks interchanged, but it is worth remarking that
by virtue of the result just proved the value of $\tau$, and hence the total number of concordances between
pairs of ranks, must be increased.

REFERENCES 

Moran, P. A. P. (1948). Proc. Camb. Phil. Soc. 44, 142.

Daniels, H. E. (1944). Biometrika, 33, 129.


Approximation errors in distributions of independent variates 

By H. O. HARTLEY 

1. Let $x$ and $y$ be two independent variates and let $q = \phi(x, y)$ be a function of these variates (e.g.
the ratio $x/y$ or the product $x \cdot y$). In distribution theory one is frequently faced with the following
problem:

If we approximate to the distribution of $x$ and/or $y$, what is the effect of such approximations on the
distribution of $q$?

In this note we derive a lemma which, although mathematically trivial, provides a gauge for this 
error and has been of great help in recent work. It appeared worth while therefore to put it on record 
(see §5). 

2. To fix the ideas we assume that $0 \le x < \infty$, $0 \le y < \infty$, and that the differentiable function $q = \phi(x, y)$
is monotonic increasing in $x$ and monotonic in $y$, so that differentiable and monotonic inversion functions
$x = \psi(q, y)$ and $y = \chi(q, x)$ exist. It is obvious from the argument given below that some of these
restrictions are in fact not essential.

3. Let $F(X)$ be the chance for $0 < x \le X$ and $G(Y)$ be the chance for $0 < y \le Y$, and denote by $f(X)$
and $g(Y)$ approximations to $F$ and $G$. We shall assume that these approximations are themselves
'probability integrals', so that

$$\lim_{X \to \infty} f(X) = \lim_{Y \to \infty} g(Y) = 1 \quad \text{and} \quad f'(X) \ge 0, \; g'(Y) \ge 0, \tag{1}$$

where the distribution functions $f'$ and $g'$ are the differentials of $f$ and $g$. Let us denote the differences
between exact probability integrals and the approximations by

$$\epsilon(X) = F(X) - f(X), \qquad \eta(Y) = G(Y) - g(Y), \tag{2}$$

so that

$$\lim_{X \to \infty} \epsilon(X) = \lim_{Y \to \infty} \eta(Y) = 0, \qquad \epsilon(0) = \eta(0) = 0. \tag{3}$$

The probability integral of $q$, i.e. the chance ($H(Q)$ say) of $q = \phi(x, y) \le Q$, is then given by

$$H(Q) = \underset{\phi(x,y) \le Q}{\int_0^\infty \!\! \int_0^\infty} F'(x)\, G'(y)\, dx\, dy, \tag{4}$$





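where $F'$ and $G'$ are the distribution functions of $x$ and $y$. We have from (2) and (4)

$$H(Q) = \underset{\phi(x,y) \le Q}{\int_0^\infty \!\! \int_0^\infty} f'(x)\, g'(y)\, dx\, dy + \underset{\phi(x,y) \le Q}{\int_0^\infty \!\! \int_0^\infty} \{F'\eta' + g'\epsilon'\}\, dx\, dy$$

$$= h(Q) + \theta(Q) \quad \text{(say)}.$$

The integral $h(Q)$ is the approximation to $H(Q)$ computed from the approximate probability integrals
$f$ and $g$ and $\theta(Q)$ the error thereby committed.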



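4. To estimate this error we have

$$\theta(Q) = \int_0^\infty \eta'(y) \left\{ \int_0^{\psi(Q,y)} F'(x)\, dx \right\} dy + \int_0^\infty g'(y) \left\{ \int_0^{\psi(Q,y)} \epsilon'(x)\, dx \right\} dy.$$

Applying partial integration to the first term, ordinary integration to the second term and noting (3),
we reach

$$\theta(Q) = -\int_0^\infty \left\{ F'(\psi(Q,y))\, \eta(y)\, \frac{\partial}{\partial y} \psi(Q,y) \right\} dy + \int_0^\infty \{ g'(y)\, \epsilon(\psi(Q,y)) \}\, dy$$

or

$$\theta(Q) = -\int_{\psi(Q,0)}^{\psi(Q,\infty)} F'(\xi)\, \eta(\chi(Q,\xi))\, d\xi + \int_0^\infty \{ g'(y)\, \epsilon(\psi(Q,y)) \}\, dy.$$

Now since $F' \ge 0$ and $g' \ge 0$, and since

$$\left| \int_{\psi(Q,0)}^{\psi(Q,\infty)} F'(\xi)\, d\xi \right| \le 1, \qquad \int_0^\infty g'(y)\, dy = 1,$$

we obtain immediately

$$|\theta(Q)| \le \max |\eta(y)| + \max |\epsilon(x)|,$$

and this inequality proves the following lemma: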


5. Lemma. Let $x$ and $y$ be independent variates with probability integrals $F(X)$ and $G(Y)$ respectively;
let $f(X)$ and $g(Y)$ be approximations to $F$ and $G$ with errors $\epsilon(X)$ and $\eta(Y)$ respectively; finally,
let $q = \phi(x, y)$ be a function of the variates $x$ and $y$ satisfying the above conditions (see § 2). If, then,
the probability integral of $q$ is computed from $f$ and $g$, the error thereby committed is smaller than
$\max |\epsilon| + \max |\eta|$. By repeated application of the lemma, the generalization to functions of more
than two variates, $q = \phi(x_1, x_2, \ldots, x_m)$, is obvious.
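
A small numerical check of the lemma may be helpful. The example below is not from the note; it is a sketch under assumed distributions chosen because every probability integral has a closed form: $x$ and $y$ are exponential with unit rate, each is approximated by an exponential with rate 1.1, and $q = x + y$, which is monotonic increasing in both arguments.

```python
import math

RATE = 1.1  # assumed rate of the approximating exponential distributions


def F(x):
    """Exact probability integral of x (and of y): exponential, unit rate."""
    return 1.0 - math.exp(-x)


def f(x):
    """Approximate probability integral: exponential with rate RATE."""
    return 1.0 - math.exp(-RATE * x)


def H(Q):
    """Exact probability integral of q = x + y (sum of two unit exponentials)."""
    return 1.0 - math.exp(-Q) * (1.0 + Q)


def h(Q):
    """Probability integral of q computed from the approximations f and g = f."""
    return 1.0 - math.exp(-RATE * Q) * (1.0 + RATE * Q)


grid = [0.01 * k for k in range(5001)]                   # 0, 0.01, ..., 50
max_eps = max(abs(F(x) - f(x)) for x in grid)            # max |epsilon|; max |eta| is the same here
max_theta = max(abs(H(Q) - h(Q)) for Q in grid)          # largest error in the distribution of q
bound = 2.0 * max_eps                                    # max |epsilon| + max |eta|

print("max |theta| =", round(max_theta, 4), " bound =", round(bound, 4))
```

The maxima are taken over a finite grid, so they are approximations to the true maxima; within that limitation the error in the distribution of $q$ should stay below the sum of the two component errors.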


Correlations between $\chi^2$ cells
By F. N. DAVID


1. The population studied is assumed to fall into $k$ groups or strata, there being a proportion $p_i$
$(i = 1, 2, \ldots, k)$ in the $i$th group. A sample of size $N$ is randomly and independently drawn from the
population, the number falling into the $i$th group being $n_i$. We write

$$m_i = N p_i \quad \text{and} \quad x_i = n_i - m_i.$$

It is well known that if the only restriction which is placed on the sample is that the totals of observation
and expectation are made to agree, i.e. if

$$\sum_{i=1}^{k} x_i = 0,$$

then, writing $r_{ij}$ for the coefficient of correlation between $x_i$ and $x_j$,

$$r_{ij} = -\left\{ \frac{m_i m_j}{(N - m_i)(N - m_j)} \right\}^{1/2}.$$

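2. The coefficient of correlation between $x_i$ and $x_j$ when more than one restriction is placed on the
sample is not easily determined. It may, however, be deduced from a consideration of the multivariate
normal surface. The multivariate normal surface may be written

$$p(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2} \sigma_1 \sigma_2 \cdots \sigma_n R^{1/2}} \exp\left\{ -\frac{1}{2R} \left[ \sum_{i=1}^{n} R_{ii} \frac{x_i^2}{\sigma_i^2} + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} R_{ij} \frac{x_i x_j}{\sigma_i \sigma_j} \right] \right\},$$

where

$$R = \begin{vmatrix} 1 & r_{12} & r_{13} & \ldots & r_{1n} \\ r_{21} & 1 & r_{23} & \ldots & r_{2n} \\ \vdots & & & & \vdots \\ r_{n1} & r_{n2} & r_{n3} & \ldots & 1 \end{vmatrix},$$

$R_{ij}$ is the minor obtained by omitting the $i$th row and the $j$th column, and $r_{ij}$ is the coefficient of
correlation between $x_i$ and $x_j$. Consider the exponent, and write it as the sum of $n$ linear squares, viz.:

$$(\alpha_{11} x_1 + \alpha_{12} x_2 + \alpha_{13} x_3 + \ldots + \alpha_{1n} x_n)^2 + (\alpha_{22} x_2 + \alpha_{23} x_3 + \ldots + \alpha_{2n} x_n)^2 + \ldots + \alpha_{nn}^2 x_n^2.$$

Make the substitutions

$$z_1 = \alpha_{11} x_1 + \alpha_{12} x_2 + \ldots + \alpha_{1n} x_n,$$
$$z_2 = \alpha_{22} x_2 + \ldots + \alpha_{2n} x_n,$$
$$\ldots$$
$$z_n = \alpha_{nn} x_n,$$

and solve for the $x$'s. We have then a series of relations of the form

$$x_1 = \beta_{11} z_1 + \beta_{12} z_2 + \ldots + \beta_{1n} z_n,$$
$$x_2 = \beta_{22} z_2 + \ldots + \beta_{2n} z_n,$$
$$\ldots$$
$$x_n = \beta_{nn} z_n.$$

If we now put $x_1 = x_2 = \ldots = x_n = 0$, we have $n$ equations which may be regarded as $n$ planes in an
$n$-dimensioned space. If $\theta_{ij}$ is the angle between the $i$th and the $j$th plane, then

$$\theta_{ij} = \cos^{-1} r_{ij}.$$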

3. I believe this last relation to be well known,* although I have not been able to find a reference to its
proof anywhere. The proof is, however, quite straightforward, making use of the well-known relations
between the minors of a determinant. It is suggested that this result may be used to find the correlation
between $x_i$ and $x_j$ of § 1. It is assumed that the number, $N$, in the sample, and the number of groups are
large enough for the assumptions regarding $\chi^2$ to be satisfied; that is to say, it is assumed that $n_i$ is normally
distributed about $m_i$ for $(i = 1, 2, \ldots, k)$. We consider a $\chi^2$ of $k$ groups where we have placed $p$ linear
restraints on the $x$'s. The probability $P\{\chi^2 > \chi_0^2\}$, where $\chi_0^2$ is some constant, is given by the multiple
integral in a $(k - p)$-dimensioned space taken over the domain $D$ defined by $\chi^2 > \chi_0^2$. We have then

$$P\{\chi^2 > \chi_0^2\} = \text{constant} \times \int\!\!\int \cdots \int_D \exp\left\{ -\tfrac{1}{2} \sum_{i=1}^{k} \frac{x_i^2}{m_i} \right\} dx_1\, dx_2 \ldots dx_{k-p}, \tag{1}$$

where $x_{k-p+1}, \ldots, x_k$ can be expressed in terms of $x_1, x_2, \ldots, x_{k-p}$ by solving the equations by means of
which the linear restraints are expressed. The expression under the integral sign is equivalent to a
multivariate normal distribution in $k - p$ dimensions; accordingly, by writing the exponent in the way
described in § 2 and finding the angle between the appropriate planes $x_i = 0$ and $x_j = 0$, $r_{ij}$ can be deduced.
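
The procedure of §§ 2 and 3 can be sketched numerically for the case of one restraint. The Python below is an illustration, not the author's computation: it assumes the exponent has been reduced to a quadratic form in $x_1, \ldots, x_{k-1}$ by eliminating the last deviation, writes that form as a sum of linear squares by a Cholesky-type factorization, solves for the $x$'s, and takes the cosine of the angle between the planes. All function names and the expectations used are hypothetical.

```python
import math


def quadratic_form_one_restraint(m):
    """Matrix P with sum(x_i^2 / m_i) = x^T P x after eliminating the last x
    by means of the single restraint sum(x_i) = 0."""
    k = len(m) - 1
    return [[(1.0 / m[i] if i == j else 0.0) + 1.0 / m[-1] for j in range(k)]
            for i in range(k)]


def linear_squares(P):
    """Upper-triangular alpha with x^T P x = sum_k (sum_j alpha[k][j] x_j)^2."""
    n = len(P)
    alpha = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = P[i][j] - sum(alpha[k][i] * alpha[k][j] for k in range(i))
            alpha[i][j] = math.sqrt(s) if i == j else s / alpha[i][i]
    return alpha


def solve_for_x(alpha):
    """Upper-triangular beta with x = beta z, i.e. the inverse of alpha."""
    n = len(alpha)
    beta = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n - 1, -1, -1):
            rhs = (1.0 if i == j else 0.0) - sum(alpha[i][k] * beta[k][j]
                                                 for k in range(i + 1, n))
            beta[i][j] = rhs / alpha[i][i]
    return beta


def plane_cosine(beta, i, j):
    """Cosine of the angle between the planes x_i = 0 and x_j = 0 in z-space."""
    dot = sum(a * b for a, b in zip(beta[i], beta[j]))
    return dot / math.sqrt(sum(a * a for a in beta[i]) * sum(b * b for b in beta[j]))


def r12_one_restraint(m):
    beta = solve_for_x(linear_squares(quadratic_form_one_restraint(m)))
    return plane_cosine(beta, 0, 1)


equal = [20.0] * 5                           # five groups, N = 100, equal expectations
unequal = [10.0, 20.0, 40.0, 20.0, 10.0]     # hypothetical unequal expectations, N = 100
print(r12_one_restraint(equal), r12_one_restraint(unequal))
```

For one restraint the cosine should reproduce $-\{m_1 m_2 / ((N - m_1)(N - m_2))\}^{1/2}$ of § 1, which is $-0.25$ for five equally weighted groups.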


4. As an illustration consider the case of a population which is divided into five groups. A sample of $N$
is drawn, and, using the notation of § 1, we have that the correlation between $x_1$ and $x_2$ is, for the case of
one restraint,

$$r_{12} = -\left\{ \frac{m_1 m_2}{(N - m_1)(N - m_2)} \right\}^{1/2}.$$

We begin by showing that this may be deduced by the method outlined here. If the one restraint is that
the sum of the $x$'s is zero, then we have

$$x_1 + x_2 + x_3 + x_4 + x_5 = 0 \quad \text{or} \quad -x_5 = x_1 + x_2 + x_3 + x_4.$$

The exponent under the $\chi^2$ integral may therefore be written (dropping the multiplier $-\tfrac{1}{2}$ for convenience)

$$\frac{x_1^2}{m_1} + \frac{x_2^2}{m_2} + \frac{x_3^2}{m_3} + \frac{x_4^2}{m_4} + \frac{(x_1 + x_2 + x_3 + x_4)^2}{m_5}.$$


* It is easily deduced from determinantal relations given by K. Pearson in lectures and in various
papers dealing with multiple correlation. M. G. Kendall points out that it follows as a natural result of
the discussion on p. 372 of his Advanced Theory of Statistics, vol. 1. Sheppard used the bivariate result as
a means of estimating the correlation coefficient.




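This may be rewritten as

$$\left[ x_1 \left( \frac{m_1 + m_5}{m_1 m_5} \right)^{1/2} + (x_2 + x_3 + x_4) \left( \frac{m_1}{m_5 (m_1 + m_5)} \right)^{1/2} \right]^2 + \left[ x_2 \left( \frac{m_1 + m_2 + m_5}{m_2 (m_1 + m_5)} \right)^{1/2} + (x_3 + x_4) \left( \frac{m_2}{(m_1 + m_2 + m_5)(m_1 + m_5)} \right)^{1/2} \right]^2$$

$$+ \left[ x_3 \left( \frac{N - m_4}{m_3 (m_1 + m_2 + m_5)} \right)^{1/2} + x_4 \left( \frac{m_3}{(N - m_4)(m_1 + m_2 + m_5)} \right)^{1/2} \right]^2 + \left[ x_4 \left( \frac{N}{m_4 (N - m_4)} \right)^{1/2} \right]^2.$$

Substituting $z_1$ for the expression in the first bracket, $z_2$ for the second and so on, and solving for
$x_1$, $x_2$, $x_3$ and $x_4$ in terms of the $z$'s, we have

$$x_1 = \left( \frac{m_1 m_5}{m_1 + m_5} \right)^{1/2} z_1 - \frac{m_1 m_2^{1/2}}{(m_1 + m_5)^{1/2} (m_1 + m_2 + m_5)^{1/2}} z_2 - \frac{m_1 m_3^{1/2}}{(m_1 + m_2 + m_5)^{1/2} (N - m_4)^{1/2}} z_3 - \frac{m_1 m_4^{1/2}}{N^{1/2} (N - m_4)^{1/2}} z_4,$$

$$x_2 = \frac{(m_1 + m_5)^{1/2} m_2^{1/2}}{(m_1 + m_2 + m_5)^{1/2}} z_2 - \frac{m_2 m_3^{1/2}}{(m_1 + m_2 + m_5)^{1/2} (N - m_4)^{1/2}} z_3 - \frac{m_2 m_4^{1/2}}{N^{1/2} (N - m_4)^{1/2}} z_4,$$

$$x_3 = \frac{(m_1 + m_2 + m_5)^{1/2} m_3^{1/2}}{(N - m_4)^{1/2}} z_3 - \frac{m_3 m_4^{1/2}}{N^{1/2} (N - m_4)^{1/2}} z_4,$$

$$x_4 = \frac{(N - m_4)^{1/2} m_4^{1/2}}{N^{1/2}} z_4.$$

The angle between the planes $x_1 = 0$ and $x_2 = 0$ will be

$$\theta_{12} = \cos^{-1} \left\{ -\left( \frac{m_1 m_2}{(N - m_1)(N - m_2)} \right)^{1/2} \right\}.$$

For this case the result may also be reached by calculating

$$r_{34} = -\left( \frac{m_3 m_4}{(N - m_3)(N - m_4)} \right)^{1/2}$$

and deducing $r_{12}$ by symmetry.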

5. It will be noted that the restraint placed on $x_i$ in the preceding section is equivalent to making the
totals of observation and expectation agree, as was pointed out in § 1. We now proceed to place such
further restraints on $x_i$ as are made when moments of observation and expectation are put in agreement.
Admittedly this narrows the field of investigation to a certain extent, but it may be argued that such
restraints are those which are most often met in practice. We assume therefore that the two restraints
placed on the sample are

$$x_1 + x_2 + x_3 + x_4 + x_5 = 0,$$
$$-2x_1 - x_2 + x_4 + 2x_5 = 0,$$

or that the totals of observation and expectation have been made to agree, and that the population mean
has been estimated from the sample. We solve for $x_4$ and $x_5$, express the quantity

$$\sum_{i=1}^{5} \frac{x_i^2}{m_i}$$

in terms of $x_1$, $x_2$ and $x_3$ only, write the expression so obtained as the sum of three linear squares and, after
substitution of the new variables $z_1$, $z_2$ and $z_3$, solve for the $x$'s. We thus obtain

$$x_1 = \frac{(m_1 m_4 m_5)^{1/2}}{A^{1/2}} z_1 - \frac{m_1 m_2^{1/2} (12 m_5 + 6 m_4)}{A^{1/2} B^{1/2}} z_2 - \frac{m_1 m_3^{1/2} (8 m_5 + 3 m_4 - m_2)}{B^{1/2} C^{1/2}} z_3,$$

$$x_2 = \frac{A^{1/2} m_2^{1/2}}{B^{1/2}} z_2 - \frac{m_2 m_3^{1/2} (6 m_5 + 2 m_4 + 2 m_1)}{B^{1/2} C^{1/2}} z_3,$$

$$x_3 = \frac{B^{1/2} m_3^{1/2}}{C^{1/2}} z_3,$$

where

$$A = m_4 m_5 + 16 m_1 m_5 + 9 m_1 m_4,$$
$$B = A + m_2 (9 m_5 + 4 m_4 + m_1),$$
$$C = B + m_3 (4 m_5 + m_4 + m_2 + 4 m_1).$$

The cosine of the angle between the planes $x_1 = 0$, $x_2 = 0$ is

$$r_{12} = -\frac{(12 m_5 + 6 m_4 + 2 m_3)(m_1 m_2)^{1/2}}{(F G)^{1/2}},$$

where

$$F = m_5 m_4 + 4 m_5 m_3 + 16 m_5 m_1 + m_4 m_3 + 9 m_4 m_1 + 4 m_3 m_1,$$
$$G = m_5 m_4 + 4 m_5 m_3 + 9 m_5 m_2 + m_4 m_3 + 4 m_4 m_2 + m_3 m_2.$$

Again it will be noted that owing to symmetry in the restraints $r_{45}$ may be deduced from $r_{12}$ by the
substitution of the appropriate indices.
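
The two-restraint result can be cross-checked without completing squares. The sketch below does not follow the route through angles between planes: it starts from independent normal deviations with variances $m_i$ and imposes each linear restraint in turn by the usual conditioning formula for a normal distribution, which is a substitute technique used here only as an independent check. The closed form is the expression for $r_{12}$ in terms of $F$ and $G$; the function names and the unequal expectations are assumptions.

```python
import math


def constrained_correlation(m, restraints, i, j):
    """Correlation between x_i and x_j when independent normal deviations with
    variances m_k are conditioned, one restraint at a time, on c . x = 0."""
    n = len(m)
    cov = [[m[a] if a == b else 0.0 for b in range(n)] for a in range(n)]
    for c in restraints:
        v = [sum(cov[a][b] * c[b] for b in range(n)) for a in range(n)]   # cov c
        denom = sum(c[a] * v[a] for a in range(n))                        # c' cov c
        cov = [[cov[a][b] - v[a] * v[b] / denom for b in range(n)] for a in range(n)]
    return cov[i][j] / math.sqrt(cov[i][i] * cov[j][j])


def r12_closed_form(m):
    """r_12 for five groups under the two restraints, from the F and G expression."""
    m1, m2, m3, m4, m5 = m
    F = m5 * m4 + 4 * m5 * m3 + 16 * m5 * m1 + m4 * m3 + 9 * m4 * m1 + 4 * m3 * m1
    G = m5 * m4 + 4 * m5 * m3 + 9 * m5 * m2 + m4 * m3 + 4 * m4 * m2 + m3 * m2
    return -(12 * m5 + 6 * m4 + 2 * m3) * math.sqrt(m1 * m2) / math.sqrt(F * G)


totals = [1.0, 1.0, 1.0, 1.0, 1.0]        # sum of x_i = 0
mean = [-2.0, -1.0, 0.0, 1.0, 2.0]        # -2 x_1 - x_2 + x_4 + 2 x_5 = 0

equal = [20.0] * 5
unequal = [5.0, 15.0, 30.0, 35.0, 15.0]   # hypothetical expectations

r_one = constrained_correlation(equal, [totals], 0, 1)
r_two = constrained_correlation(equal, [totals, mean], 0, 1)
r_two_unequal = constrained_correlation(unequal, [totals, mean], 0, 1)
print(r_one, r_two, r_two_unequal, r12_closed_form(unequal))
```

With equal expectations the two routes give $-0.25$ for one restraint and $-20/\sqrt{700} \approx -0.756$ for two, and they should also agree for unequal expectations.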

6. It is seen, as in fact is expected, that the correlation between $x_i$ and $x_j$ for more than one restraint
depends on the shape of the population from which the sample has been drawn, i.e. it depends on the
relative values of $m_1, m_2, \ldots$, and it will also depend on the type of restraint which is placed on the sample.
The behaviour of these correlations is interesting and study of them arose out of a larger investigation
into the relations between the signs of the deviations as the number of moment-restraints is increased.
Thus it is found that the correlation between $x_1$ (say) and $x_2$, negative for one restraint, increases
numerically to $-1$ as the number of moment-restraints increases. On the other hand, the correlation between $x_1$
and $x_3$, negative for one restraint, decreases numerically and finally increases to $+1$ with increasing
numbers of moment-restraints. Generally we may expect the correlations between two $x$'s with odd
subscripts or two $x$'s with even subscripts to tend to $+1$, and the correlations between an $x$ with an
odd subscript and an $x$ with an even subscript to tend to $-1$.

7. Several tests regarding the signs of the deviations, $x_i$, have been proposed recently and the opinion
has been expressed that these signs could be regarded as effectively independent for the case where more
than one restraint is placed on the material. For such sign tests what is important is the correlation
between adjacent deviations and this will, for the case of moment-restraints at any rate, increase
numerically to $-1$; for the case of extreme deviations this increase will be a rapid one. For illustrative
purposes I have considered two cases, the first where all the $m$'s are given equal weight, i.e. letting
$m_1 = m_2 = m_3 = \ldots = m_k$, and second where $m_1 = m_k < m_2 = m_{k-1} < m_3 = $ etc.* For the first case the
correlation between $x_1$ and $x_2$ was found for five groups and 1, 2, 3 moment-restraints, seven groups and
1, 2, 3, 4 moment-restraints, nine groups and 1, 2, 3, 4 moment-restraints. The correlation coefficients are
given in Table 1.


Table 1. Correlation between $x_1$ and $x_2$; equal weighting of expectations

No. of groups                   5                          7                                 9
No. of restraints         1      2      3         1      2      3      4          1      2      3      4
Correlation, r_12       -0.25  -0.76  -0.92     -0.17  -0.58  -0.87  -0.95      -0.12  -0.43  -0.80  -0.92


For the second case the correlations were computed for five groups only and the results are given in the
last row of Table 2.


Table 2. Correlation between $x_1$ and $x_2$; expectations weighted equally and unequally

No. of groups                                     5
No. of restraints                           1      2      3
Correlation, r_12: Equal weights          -0.25  -0.76  -0.92
                   Unequal weights        -0.12  -0.57  -0.94


The precise numerical value of these correlation coefficients is not important; what is important is the 
rapidity with which the correlation increases as the number of moment -restraints is increased. This rapid 
increase would suggest that any test derived on the assumption of the independence of signs of deviations, 
for extreme observations at any rate, is of doubtful validity. 

8. As a check on theory I considered the first 100 samples of the 208 samples, each of 200 observations,
used by Neyman & Pearson (1928) as an illustration in their $\chi^2$ paper. The population in this case is a


* Actually the frequencies were obtained by dividing the normal curve between mean ± 3 <r into groups 
with equal base range. 






cubic divided into eight groups. I have shown previously (David, 1947) that for the case where one
restraint is placed on the material the order of the signs of the deviations can be regarded as random, thus
indicating that the effect of the correlation is not felt. We now consider the case when four restraints are
placed on the material; the totals of expectation and observation are made to agree and the first three
moments are estimated from the data. The correlation between $x_1$ and $x_2$ is now equal to $-0.840$, and the
correlation between $x_1$ and $x_3$ is $+0.213$. It will be noted that $r_{12}$ is comparable in magnitude with the
correlations given in Table 1 for 4 restraints and equal weighting of the expectations. It is clear
that both these correlations are too large to be neglected, that between $x_1$ and $x_2$ being of sufficient
magnitude seriously to invalidate any sign test which is based on a hypothesis of independence or of
randomness.


9. It will be realized that the manner in which restraints are imposed on the differences $x_i = n_i - m_i$ is
not quite the same in the problem we have been considering as in that arising in tests where a theoretical
law is fitted to the data. In the former case the $x_i$ satisfy certain linear relations because we confine
attention to the set of samples for which the $n_i$ satisfy certain conditions. In the second case the $n_i$ are
unrestricted except for the condition

$$\sum_{i=1}^{k} n_i = N,$$

but in calculating $x_i$ we substitute $\hat{m}_i$ for $m_i$, where these estimates are so chosen that $n_i - \hat{m}_i$ satisfy
similar conditions.

This question has been discussed at some length only by Neyman & Pearson (1928) and is invariably
inadequately treated in statistical text-books. I do not propose to discuss the matter fully here but
would offer certain remarks which lead me to believe that the substitution of sample estimates, $\hat{m}_i$, for the
population values, $m_i$, in (1) will not seriously invalidate the calculated correlations.

We may consider for simplicity a population of four groups. As before, we shall have

$$x_i = n_i - m_i,$$

and we shall write

$$X_i = \hat{m}_i - m_i = f_i(\hat{\theta}),$$

where $\hat{\theta}$ is the estimate of the population parameter which is calculated from the observations. Because
$\hat{\theta}$ is an estimate it will vary from sample to sample and the estimated values $X_1, X_2, X_3, X_4$ will lie on
a curve (termed by Neyman & Pearson the population locus), which will depend on $\hat{\theta}$ alone. It is clear
therefore that for any given set of observations $x_1, x_2, x_3, x_4$ the estimated population point will depend on
the method of fitting used, or, perhaps more precisely, on the method of estimation of $\theta$.

If the structure of the restraint placed on the observations for the purpose of estimating $\theta$ is such as to
lead to a minimum

$$\chi^2 = \sum_{i=1}^{4} \frac{(n_i - \hat{m}_i)^2}{\hat{m}_i},$$

then the correlations, obtained under the assumption that (1) is true when $\hat{m}_i$ is substituted for $m_i$, will be
approximately correct. In general the method of moments will not lead to this minimum $\chi^2$ exactly, but the
fitted expectations will not usually be very different from the minimum values, and the error made in
assuming (1) is true will not be large.


REFERENCES 

David, F. N. (1947). Biometrika, 34, 299.

Neyman, J. & Pearson, E. S. (1928). Biometrika, 20 A, 274.


Note on ‘Proofs of the distribution law of the second order moment statistics’

By JOHN WISHART 

Since the paper under the above title was published (1948, Biometrika, 35, 55), my attention has
been called to a paper by E. Sverdrup, ‘Derivation of the Wishart Distribution of the Second
Order Sample Moments by straightforward Integration of a Multiple Integral’ (1947, Skand.
AktuarTidskr. 30, 151). A copy has also been received of a paper by R. D. Narain, ‘A New Approach
to Sampling Distributions of the Multivariate Normal Theory’, in which the same distribution is
derived. This paper will shortly be published in India.



[ 423 ] 


REVIEW 

Probit Analysis. By D. J. Finney, M.A. Cambridge University Press. 1947. Price: 18s. 

The publication of books relating to specialized statistical techniques is a welcome feature of recent
years. In such books it is possible to give an account sufficiently detailed to provide a practical guide
to the user of these techniques in most of the problems to be encountered, as opposed to the necessarily 
more restricted treatment of any particular topic in a general text-book. Mr Finney's volume on probit 
analysis, with special emphasis on applications in biological research, is an excellent example of this 
specialized type of book, not only taking full advantage of the possibility of exhaustive treatment, but 
also presenting the arguments in such a way as to ease the reader’s task in mastering the methods 
described. 

Starting from a simple description of the types of problem to be treated, techniques covering almost 
every eventuality in the field commonly described as ‘dosage-mortality’ are developed and their
practical application illustrated by means of numerical examples. The standard methods of probit 
analysis are carefully described and very useful sets of tables (Tables I-IV) are provided. Tables III 
and IV, in particular, should considerably facilitate the computation of working probits. Table III gives 
both maximum and minimum working probits for expected probits at intervals of 0.1; Table IV enables
the working probit to be obtained directly from the provisional probit and the percentage ‘kill’.
Table II, apart from giving values of Q/Z, gives values of the weighting coefficient allowing for the effect
of natural mortality. Chapter 6 is devoted to the consideration of methods of allowing for natural 
mortality and should prove most useful. While the methods described are not new, it is the first time 
that they have appeared in a systematic treatise in such a way as to bring them to general notice. 
Similar remarks might be made about the other special features of the book — the treatment of factorial 
experiments, of the joint action of different poisons and of quantitative responses, for example. Most 
of the methods described have already been published in individual papers, but Mr Finney has performed
an important task in bringing them together in one consolidated account. The method of
development is not, of course, theoretically complete or rigorous, but it is logically clear, and should 
provide an invaluable aid to experimentalists in their appreciation of the statistical tests described. 
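
As an illustration of what Tables III and IV tabulate, the following sketch computes working probits directly. It assumes the usual definitions, which are not spelled out in this review: a probit is a normal deviate plus 5, P is the expected proportion responding at the provisional probit Y, Q = 1 - P, Z is the normal ordinate at Y - 5, and the working probit is y = Y + (p - P)/Z for an observed proportion p. The function names are hypothetical.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def working_probit(provisional_probit, proportion_killed):
    """Working probit y = Y + (p - P) / Z under the assumed definitions."""
    deviate = provisional_probit - 5.0           # probit = normal deviate + 5
    expected = STANDARD_NORMAL.cdf(deviate)      # P, expected proportion killed
    ordinate = STANDARD_NORMAL.pdf(deviate)      # Z, normal ordinate
    return provisional_probit + (proportion_killed - expected) / ordinate

def working_probit_limits(provisional_probit):
    """Minimum (p = 0) and maximum (p = 1) working probits: Y - P/Z and Y + Q/Z."""
    return (working_probit(provisional_probit, 0.0),
            working_probit(provisional_probit, 1.0))

low, high = working_probit_limits(5.0)
print(low, high)                  # limits for a provisional probit of 5.0
print(working_probit(5.0, 0.62))  # a hypothetical 62 per cent kill
```

Because the working probit is linear in p, it can be obtained from the minimum working probit and the range 1/Z alone, which is presumably why tabulating those quantities shortens the computation.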

It may seem ungenerous to extract for special criticism the very few passages in this useful book 
which seem to be ambiguous or capable of improvement. Such criticism will be made, however, as 
a modest attempt to help readers of the work to avoid certain difficulties and confusion which they 
might possibly experience. It appears to the reviewer that some difficulties may result from the 
author’s desire to simplify the presentation of the underlying theory. While it must be admitted that 
some simplification is necessary, this should not be such as to allow of false impressions of the theoretical 
situation. As it appears that considerable knowledge of general statistical theory is expected of the 
reader, there would seem to be no need to avoid the use of simple technicalities in clarifying the 
exposition. 

We may instance the arguments for introducing the heterogeneity factor (p. 33), the working probit 
(p. 48) and the correction for natural mortality (p. 88). In the first of these cases it should be noted 
that the interval of estimation will be on the average too long if there is no heterogeneity; while if 
heterogeneity is present the occasional use of normal factors will be incorrect. On p. 48, the implication 
that it is merely the asymmetry of the distribution of p which gives rise to the need for working probits 
gives a rather incomplete notion of the theory. Finally, the arguments on corrections for natural 
mortality could be much simplified if they were based directly on equation (6.2).

The use of both LD 50 and ED 50 to symbolize the same population parameter, LD 50 being restricted
to cases where the response is death, may be somewhat confusing. We note, indeed, that the author
himself uses the symbol ED 50 on p. 124, when considering ‘kills’. It may save some confusion to point
out that the quantity defined as on p. 117, is afterwards represented as x 12 (there is also a misplaced
bracket in the first equation on this page); there is also a misprint on p. 89, where the reference
to equation (3.5) should be to equation (3.4).

The comparatively minor importance of the above criticisms must be emphasized. The book as 
a whole is a valuable addition to the statistician’s library and is indispensable for persons concerned 
with ‘dosage-mortality’ problems, in whatever form they arise. Apart from its other merits, the very
full bibliography makes the book invaluable as a reference index in its own field. May other works of
this specialized nature maintain the standard set by Mr Finney! N. L. J.



[ 424 ]


CORRIGENDA 

1. Moments of the mean deviation from the mean in samples from a normal population. (See
in paper by H. J. Godwin, p. 308 above.) The following corrections should be made:

(1.1) R. C. Geary (1936), Biometrika, 28:

p. 300 In equation (22) the coefficient of in the expansion for m\ should read 
(51-352o t +H W <* 2 * 4 )» not (51 - 352a* + 427a 4 ). 

P* equation (24) the coefficient of n~* in the expression for m\ — m\ should read 

0.11403500, not 0.03357805.

(1.2) E. S. Pearson (1945), Biometrika, 33, p. 252:

In equation (4) the coefficient of in the expansion for $\lambda_4$ should read -0.038940, not
-0.120003. This correction makes alterations in the values for $\lambda_4$ and $\beta_2$ given in the table
of moments of the mean deviation on p. 253, which should read:


Sample size n     $\lambda_4$      $\beta_2$

      4           0.001 963        3.252
      5           0.000 9912       3.197
      6           0.000 5672       3.101
      8           0.000 2356       3.118
     10           0.000 1195       3.093
     12           0.000 00808      3.070
     15           0.000 03493      3.061
     20           0.000 01404      3.045


2. Biometrika, 35, 181 (1948):

In the list of references at the end of F. Yates’s paper, that to E. S. Pearson (1923) should
read Biometrika, 14, 261, not Biometrika, 7, 248.






